commit 810a6ddb757f13870fbc89efd8467ba114d8ad32 Author: Alexandre Frade Date: Sat Feb 15 09:59:30 2020 -0300 5.4.20-xanmod11 Signed-off-by: Alexandre Frade commit 639898ce1c8bc9e0b29816ffe93b8092a968aad6 Merge: c8f256bdc65d 27dfbcc2f53d Author: Alexandre Frade Date: Sat Feb 15 09:57:33 2020 -0300 Merge tag 'v5.4.20' into 5.4 This is the 5.4.20 stable release commit 27dfbcc2f53d5b14ef78156d15ff92619807d46c Author: Greg Kroah-Hartman Date: Fri Feb 14 16:34:20 2020 -0500 Linux 5.4.20 commit 2d8fdc5744ff678e23d8869f57cedab4f5546f74 Author: Stephen Smalley Date: Fri Nov 22 12:22:45 2019 -0500 selinux: fall back to ref-walk if audit is required commit 0188d5c025ca8fe756ba3193bd7d150139af5a88 upstream. commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") passed down the rcu flag to the SELinux AVC, but failed to adjust the test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. Previously, we only returned -ECHILD if generating an audit record with LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined equivalent in selinux_inode_permission() immediately after we determine that audit is required, and always fall back to ref-walk in this case. Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") Reported-by: Will Deacon Suggested-by: Al Viro Signed-off-by: Stephen Smalley Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit ae7f404d922747e9e97679d5bbde51e0a72693ca Author: Nicolai Stange Date: Tue Jan 14 11:39:03 2020 +0100 libertas: make lbs_ibss_join_existing() return error code on rates overflow [ Upstream commit 1754c4f60aaf1e17d886afefee97e94d7f27b4cb ] Commit e5e884b42639 ("libertas: Fix two buffer overflows at parsing bss descriptor") introduced a bounds check on the number of supplied rates to lbs_ibss_join_existing() and made it to return on overflow. However, the aforementioned commit doesn't set the return value accordingly and thus, lbs_ibss_join_existing() would return with zero even though it failed. Make lbs_ibss_join_existing return -EINVAL in case the bounds check on the number of supplied rates fails. Fixes: e5e884b42639 ("libertas: Fix two buffer overflows at parsing bss descriptor") Signed-off-by: Nicolai Stange Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 61087dce64a53f13b8cc162032f25ef80f50b15c Author: Nicolai Stange Date: Tue Jan 14 11:39:02 2020 +0100 libertas: don't exit from lbs_ibss_join_existing() with RCU read lock held [ Upstream commit c7bf1fb7ddca331780b9a733ae308737b39f1ad4 ] Commit e5e884b42639 ("libertas: Fix two buffer overflows at parsing bss descriptor") introduced a bounds check on the number of supplied rates to lbs_ibss_join_existing(). Unfortunately, it introduced a return path from within a RCU read side critical section without a corresponding rcu_read_unlock(). Fix this. Fixes: e5e884b42639 ("libertas: Fix two buffer overflows at parsing bss descriptor") Signed-off-by: Nicolai Stange Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 3c822e1f31186767d6b7261c3c066f01907ecfca Author: Qing Xu Date: Thu Jan 2 10:39:27 2020 +0800 mwifiex: Fix possible buffer overflows in mwifiex_cmd_append_vsie_tlv() [ Upstream commit b70261a288ea4d2f4ac7cd04be08a9f0f2de4f4d ] mwifiex_cmd_append_vsie_tlv() calls memcpy() without checking the destination size may trigger a buffer overflower, which a local user could use to cause denial of service or the execution of arbitrary code. Fix it by putting the length check before calling memcpy(). Signed-off-by: Qing Xu Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit c5b071e3f44d1125694ad4dcf1234fb9a78d0be6 Author: Qing Xu Date: Thu Jan 2 10:39:26 2020 +0800 mwifiex: Fix possible buffer overflows in mwifiex_ret_wmm_get_status() [ Upstream commit 3a9b153c5591548612c3955c9600a98150c81875 ] mwifiex_ret_wmm_get_status() calls memcpy() without checking the destination size.Since the source is given from remote AP which contains illegal wmm elements , this may trigger a heap buffer overflow. Fix it by putting the length check before calling memcpy(). Signed-off-by: Qing Xu Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 2cf2b620af7b59269b3fe24182b18e119f03628f Author: Chuhong Yuan Date: Mon Dec 9 16:57:11 2019 +0800 dmaengine: axi-dmac: add a check for devm_regmap_init_mmio commit a5b982af953bcc838cd198b0434834cc1dff14ec upstream. The driver misses checking the result of devm_regmap_init_mmio(). Add a check to fix it. Fixes: fc15be39a827 ("dmaengine: axi-dmac: add regmap support") Signed-off-by: Chuhong Yuan Reviewed-by: Alexandru Ardelean Link: https://lore.kernel.org/r/20191209085711.16001-1-hslester96@gmail.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit eada328f7f9bc4807fdf071861558fb131c09f85 Author: Jerome Brunet Date: Fri Dec 13 11:33:04 2019 +0100 clk: meson: g12a: fix missing uart2 in regmap table commit b1b3f0622a9d52ac19a63619911823c89a4d85a4 upstream. UART2 peripheral is missing from the regmap fixup table of the g12a family clock controller. As it is, any access to this clock would Oops, which is not great. Add the clock to the table to fix the problem. Fixes: 085a4ea93d54 ("clk: meson: g12a: add peripheral clock controller") Reported-by: Dmitry Shmidt Tested-by: Dmitry Shmidt Acked-by: Neil Armstrong Tested-by: Kevin Hilman Signed-off-by: Jerome Brunet Signed-off-by: Greg Kroah-Hartman commit 3cfb0b360b377d3b712e91b1e7c53766eda50412 Author: Bartosz Golaszewski Date: Fri Jan 3 12:41:56 2020 +0100 mfd: max77650: Select REGMAP_IRQ in Kconfig commit cb7a374a5e7a5af3f8c839f74439193add6d0589 upstream. MAX77650 MFD driver uses regmap_irq API but doesn't select the required REGMAP_IRQ option in Kconfig. This can cause the following build error if regmap irq is not enabled implicitly by someone else: ld: drivers/mfd/max77650.o: in function `max77650_i2c_probe': max77650.c:(.text+0xcb): undefined reference to `devm_regmap_add_irq_chip' ld: max77650.c:(.text+0xdb): undefined reference to `regmap_irq_get_domain' make: *** [Makefile:1079: vmlinux] Error 1 Fix it by adding the missing option. Fixes: d0f60334500b ("mfd: Add new driver for MAX77650 PMIC") Reported-by: Paul Gazzillo Signed-off-by: Bartosz Golaszewski Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 3b9586e82c90d34caa30f6b763820eddce93ae44 Author: Ben Whitten Date: Sat Jan 18 20:56:24 2020 +0000 regmap: fix writes to non incrementing registers commit 2e31aab08bad0d4ee3d3d890a7b74cb6293e0a41 upstream. When checking if a register block is writable we must ensure that the block does not start with or contain a non incrementing register. Fixes: 8b9f9d4dc511 ("regmap: verify if register is writeable before writing operations") Signed-off-by: Ben Whitten Link: https://lore.kernel.org/r/20200118205625.14532-1-ben.whitten@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 4eb12ef7491d76506549b61704c6a62cc545a307 Author: Geert Uytterhoeven Date: Wed Dec 18 20:48:07 2019 +0100 pinctrl: sh-pfc: r8a7778: Fix duplicate SDSELF_B and SD1_CLK_B commit 805f635703b2562b5ddd822c62fc9124087e5dd5 upstream. The FN_SDSELF_B and FN_SD1_CLK_B enum IDs are used twice, which means one set of users must be wrong. Replace them by the correct enum IDs. Fixes: 87f8c988636db0d4 ("sh-pfc: Add r8a7778 pinmux support") Signed-off-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20191218194812.12741-2-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman commit 7b07d15aa5b46c82bffa18357c3c0eef2adac569 Author: Geert Uytterhoeven Date: Wed Nov 13 11:16:53 2019 +0100 pinctrl: sh-pfc: r8a77965: Fix DU_DOTCLKIN3 drive/bias control commit a34cd9dfd03fa9ec380405969f1d638bc63b8d63 upstream. R-Car Gen3 Hardware Manual Errata for Rev. 2.00 of October 24, 2019 changed the configuration bits for drive and bias control for the DU_DOTCLKIN3 pin on R-Car M3-N, to match the same pin on R-Car H3. Update the driver to reflect this. After this, the handling of drive and bias control for the various DU_DOTCLKINx pins is consistent across all of the R-Car H3, M3-W, M3-W+, and M3-N SoCs. Fixes: 86c045c2e4201e94 ("pinctrl: sh-pfc: r8a77965: Replace DU_DOTCLKIN2 by DU_DOTCLKIN3") Signed-off-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20191113101653.28428-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman commit 875e01dd8a972115b58589e4715dec046eb51c0f Author: Stephen Smalley Date: Fri Jan 17 15:24:07 2020 -0500 selinux: fix regression introduced by move_mount(2) syscall commit 98aa00345de54b8340dc2ddcd87f446d33387b5e upstream. commit 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") introduced a new move_mount(2) system call and a corresponding new LSM security_move_mount hook but did not implement this hook for any existing LSM. This creates a regression for SELinux with respect to consistent checking of mounts; the existing selinux_mount hook checks mounton permission to the mount point path. Provide a SELinux hook implementation for move_mount that applies this same check for consistency. In the future we may wish to add a new move_mount filesystem permission and check as well, but this addresses the immediate regression. Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") Signed-off-by: Stephen Smalley Reviewed-by: Ondrej Mosnacek Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 3b2e595dfe2bbbedee5cf2298739d9b6663e7d6d Author: Stephen Smalley Date: Fri Nov 22 12:22:44 2019 -0500 selinux: revert "stop passing MAY_NOT_BLOCK to the AVC upon follow_link" commit 1a37079c236d55fb31ebbf4b59945dab8ec8764c upstream. This reverts commit e46e01eebbbc ("selinux: stop passing MAY_NOT_BLOCK to the AVC upon follow_link"). The correct fix is to instead fall back to ref-walk if audit is required irrespective of the specific audit data type. This is done in the next commit. Fixes: e46e01eebbbc ("selinux: stop passing MAY_NOT_BLOCK to the AVC upon follow_link") Reported-by: Will Deacon Signed-off-by: Stephen Smalley Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit 837c36e0451f14225e684bb4942a9f4b4eed3cb7 Author: Coly Li Date: Fri Jan 24 01:01:37 2020 +0800 bcache: avoid unnecessary btree nodes flushing in btree_flush_write() commit 2aa8c529387c25606fdc1484154b92f8bfbc5746 upstream. the commit 91be66e1318f ("bcache: performance improvement for btree_flush_write()") was an effort to flushing btree node with oldest btree node faster in following methods, - Only iterate dirty btree nodes in c->btree_cache, avoid scanning a lot of clean btree nodes. - Take c->btree_cache as a LRU-like list, aggressively flushing all dirty nodes from tail of c->btree_cache util the btree node with oldest journal entry is flushed. This is to reduce the time of holding c->bucket_lock. Guoju Fang and Shuang Li reported that they observe unexptected extra write I/Os on cache device after applying the above patch. Guoju Fang provideed more detailed diagnose information that the aggressive btree nodes flushing may cause 10x more btree nodes to flush in his workload. He points out when system memory is large enough to hold all btree nodes in memory, c->btree_cache is not a LRU-like list any more. Then the btree node with oldest journal entry is very probably not- close to the tail of c->btree_cache list. In such situation much more dirty btree nodes will be aggressively flushed before the target node is flushed. When slow SATA SSD is used as cache device, such over- aggressive flushing behavior will cause performance regression. After spending a lot of time on debug and diagnose, I find the real condition is more complicated, aggressive flushing dirty btree nodes from tail of c->btree_cache list is not a good solution. - When all btree nodes are cached in memory, c->btree_cache is not a LRU-like list, the btree nodes with oldest journal entry won't be close to the tail of the list. - There can be hundreds dirty btree nodes reference the oldest journal entry, before flushing all the nodes the oldest journal entry cannot be reclaimed. When the above two conditions mixed together, a simply flushing from tail of c->btree_cache list is really NOT a good idea. Fortunately there is still chance to make btree_flush_write() work better. Here is how this patch avoids unnecessary btree nodes flushing, - Only acquire c->journal.lock when getting oldest journal entry of fifo c->journal.pin. In rested locations check the journal entries locklessly, so their values can be changed on other cores in parallel. - In loop list_for_each_entry_safe_reverse(), checking latest front point of fifo c->journal.pin. If it is different from the original point which we get with locking c->journal.lock, it means the oldest journal entry is reclaim on other cores. At this moment, all selected dirty nodes recorded in array btree_nodes[] are all flushed and clean on other CPU cores, it is unncessary to iterate c->btree_cache any longer. Just quit the list_for_each_entry_safe_reverse() loop and the following for-loop will skip all the selected clean nodes. - Find a proper time to quit the list_for_each_entry_safe_reverse() loop. Check the refcount value of orignial fifo front point, if the value is larger than selected node number of btree_nodes[], it means more matching btree nodes should be scanned. Otherwise it means no more matching btee nodes in rest of c->btree_cache list, the loop can be quit. If the original oldest journal entry is reclaimed and fifo front point is updated, the refcount of original fifo front point will be 0, then the loop will be quit too. - Not hold c->bucket_lock too long time. c->bucket_lock is also required for space allocation for cached data, hold it for too long time will block regular I/O requests. When iterating list c->btree_cache, even there are a lot of maching btree nodes, in order to not holding c->bucket_lock for too long time, only BTREE_FLUSH_NR nodes are selected and to flush in following for-loop. With this patch, only btree nodes referencing oldest journal entry are flushed to cache device, no aggressive flushing for unnecessary btree node any more. And in order to avoid blocking regluar I/O requests, each time when btree_flush_write() called, at most only BTREE_FLUSH_NR btree nodes are selected to flush, even there are more maching btree nodes in list c->btree_cache. At last, one more thing to explain: Why it is safe to read front point of c->journal.pin without holding c->journal.lock inside the list_for_each_entry_safe_reverse() loop ? Here is my answer: When reading the front point of fifo c->journal.pin, we don't need to know the exact value of front point, we just want to check whether the value is different from the original front point (which is accurate value because we get it while c->jouranl.lock is held). For such purpose, it works as expected without holding c->journal.lock. Even the front point is changed on other CPU core and not updated to local core, and current iterating btree node has identical journal entry local as original fetched fifo front point, it is still safe. Because after holding mutex b->write_lock (with memory barrier) this btree node can be found as clean and skipped, the loop will quite latter when iterate on next node of list c->btree_cache. Fixes: 91be66e1318f ("bcache: performance improvement for btree_flush_write()") Reported-by: Guoju Fang Reported-by: Shuang Li Signed-off-by: Coly Li Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 7c71d438e7e58a6889b89889282f17eed7e6958e Author: Beniamin Bia Date: Tue Jan 14 15:24:01 2020 +0200 dt-bindings: iio: adc: ad7606: Fix wrong maxItems value commit a6c4f77cb3b11f81077b53c4a38f21b92d41f21e upstream. This patch set the correct value for oversampling maxItems. In the original example, appears 3 items for oversampling while the maxItems is set to 1, this patch fixes those issues. Fixes: 416f882c3b40 ("dt-bindings: iio: adc: Migrate AD7606 documentation to yaml") Signed-off-by: Beniamin Bia Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit d15a2930f6d7e0d53533f07892a4047817a025a2 Author: Gustavo A. R. Silva Date: Tue Oct 22 15:25:22 2019 +0200 media: i2c: adv748x: Fix unsafe macros commit 0d962e061abcf1b9105f88fb850158b5887fbca3 upstream. Enclose multiple macro parameters in parentheses in order to make such macros safer and fix the Clang warning below: drivers/media/i2c/adv748x/adv748x-afe.c:452:12: warning: operator '?:' has lower precedence than '|'; '|' will be evaluated first [-Wbitwise-conditional-parentheses] ret = sdp_clrset(state, ADV748X_SDP_FRP, ADV748X_SDP_FRP_MASK, enable ? ctrl->val - 1 : 0); Fixes: 3e89586a64df ("media: i2c: adv748x: add adv748x driver") Reported-by: Dmitry Vyukov Signed-off-by: Gustavo A. R. Silva Reviewed-by: Kieran Bingham Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 30dd20c6d0e34d99a0a2b697b2f351fc2658f2c8 Author: Christophe Roullier Date: Fri Nov 22 14:22:46 2019 +0100 drivers: watchdog: stm32_iwdg: set WDOG_HW_RUNNING at probe commit 85fdc63fe256b595f923a69848cd99972ff446d8 upstream. If the watchdog hardware is already enabled during the boot process, when the Linux watchdog driver loads, it should start/reset the watchdog and tell the watchdog framework. As a result, ping can be generated from the watchdog framework (if CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED is set), until the userspace watchdog daemon takes over control Fixes:4332d113c66a ("watchdog: Add STM32 IWDG driver") Signed-off-by: Christophe Roullier Reviewed-by: Guenter Roeck Link: https://lore.kernel.org/r/20191122132246.8473-1-christophe.roullier@st.com Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Signed-off-by: Greg Kroah-Hartman commit 8dfa11837606a7f46735207fe4382d5f2441db7d Author: Horia Geantă Date: Mon Jan 13 10:54:35 2020 +0200 crypto: caam/qi2 - fix typo in algorithm's driver name commit 53146d152510584c2034c62778a7cbca25743ce9 upstream. Fixes: 8d818c105501 ("crypto: caam/qi2 - add DPAA2-CAAM driver") Signed-off-by: Horia Geantă Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 1f42c6de7b941cfe42d801ead31aa7ca342a21c1 Author: Eric Biggers Date: Mon Dec 30 21:19:33 2019 -0600 crypto: atmel-sha - fix error handling when setting hmac key commit b529f1983b2dcc46354f311feda92e07b6e9e2da upstream. HMAC keys can be of any length, and atmel_sha_hmac_key_set() can only fail due to -ENOMEM. But atmel_sha_hmac_setkey() incorrectly treated any error as a "bad key length" error. Fix it to correctly propagate the -ENOMEM error code and not set any tfm result flags. Fixes: 81d8750b2b59 ("crypto: atmel-sha - add support to hmac(shaX)") Cc: Nicolas Ferre Cc: Alexandre Belloni Cc: Ludovic Desroches Signed-off-by: Eric Biggers Reviewed-by: Tudor Ambarus Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit fb42d3f4ec8677deee363e672d5a3fbba9a865f8 Author: Eric Biggers Date: Mon Dec 30 21:19:32 2019 -0600 crypto: artpec6 - return correct error code for failed setkey() commit b828f905904cd76424230c69741a4cabb0174168 upstream. ->setkey() is supposed to retun -EINVAL for invalid key lengths, not -1. Fixes: a21eb94fc4d3 ("crypto: axis - add ARTPEC-6/7 crypto accelerator driver") Cc: Jesper Nilsson Cc: Lars Persson Signed-off-by: Eric Biggers Acked-by: Lars Persson Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit ee1c6b1aa9dc791d433e95536d42a6bbd4b00ae6 Author: Eric Biggers Date: Sun Dec 1 13:53:26 2019 -0800 crypto: testmgr - don't try to decrypt uninitialized buffers commit eb455dbd02cb1074b37872ffca30a81cb2a18eaa upstream. Currently if the comparison fuzz tests encounter an encryption error when generating an skcipher or AEAD test vector, they will still test the decryption side (passing it the uninitialized ciphertext buffer) and expect it to fail with the same error. This is sort of broken because it's not well-defined usage of the API to pass an uninitialized buffer, and furthermore in the AEAD case it's acceptable for the decryption error to be EBADMSG (meaning "inauthentic input") even if the encryption error was something else like EINVAL. Fix this for skcipher by explicitly initializing the ciphertext buffer on error, and for AEAD by skipping the decryption test on error. Reported-by: Pascal Van Leeuwen Fixes: d435e10e67be ("crypto: testmgr - fuzz skciphers against their generic implementation") Fixes: 40153b10d91c ("crypto: testmgr - fuzz AEADs against their generic implementation") Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 771fd0b2e8bf7332f18c471230cf4a3db023831c Author: YueHaibing Date: Mon Dec 30 11:29:45 2019 +0800 mtd: sharpslpart: Fix unsigned comparison to zero commit f33113b542219448fa02d77ca1c6f4265bd7f130 upstream. The unsigned variable log_num is being assigned a return value from the call to sharpsl_nand_get_logical_num that can return -EINVAL. Detected using Coccinelle: ./drivers/mtd/parsers/sharpslpart.c:207:6-13: WARNING: Unsigned expression compared with zero: log_num > 0 Fixes: 8a4580e4d298 ("mtd: sharpslpart: Add sharpslpart partition parser") Signed-off-by: YueHaibing Signed-off-by: Miquel Raynal Signed-off-by: Greg Kroah-Hartman commit 1765aaef17530464ce5e02659249c8b6ea5e9f68 Author: Nathan Chancellor Date: Mon Dec 9 14:44:23 2019 -0700 mtd: onenand_base: Adjust indentation in onenand_read_ops_nolock commit 0e7ca83e82d021c928dadf4c13c137d57337540d upstream. Clang warns: ../drivers/mtd/nand/onenand/onenand_base.c:1269:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] while (!ret) { ^ ../drivers/mtd/nand/onenand/onenand_base.c:1266:2: note: previous statement is here if (column + thislen > writesize) ^ 1 warning generated. This warning occurs because there is a space before the tab of the while loop. There are spaces at the beginning of a lot of the lines in this block, remove them so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: a8de85d55700 ("[MTD] OneNAND: Implement read-while-load") Link: https://github.com/ClangBuiltLinux/linux/issues/794 Signed-off-by: Nathan Chancellor Signed-off-by: Miquel Raynal Signed-off-by: Greg Kroah-Hartman commit 7df80a021f66aea81a859aef6ffbe5cb2a502abf Author: Suzuki K Poulose Date: Mon Jan 13 23:30:23 2020 +0000 arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly commit 52f73c383b2418f2d31b798e765ae7d596c35021 upstream. We detect the absence of FP/SIMD after an incapable CPU is brought up, and by then we have kernel threads running already with TIF_FOREIGN_FPSTATE set which could be set for early userspace applications (e.g, modprobe triggered from initramfs) and init. This could cause the applications to loop forever in do_nofity_resume() as we never clear the TIF flag, once we now know that we don't support FP. Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag for tasks which may have them set, as we would have done in the normal case, but avoiding touching the hardware state (since we don't support any). Also to make sure we handle the cases seemlessly we categorise the helper functions to two : 1) Helpers for common core code, which calls into take appropriate actions without knowing the current FPSIMD state of the CPU/task. e.g fpsimd_restore_current_state(), fpsimd_flush_task_state(), fpsimd_save_and_flush_cpu_state(). We bail out early for these functions, taking any appropriate actions (e.g, clearing the TIF flag) where necessary to hide the handling from core code. 2) Helpers used when the presence of FP/SIMD is apparent. i.e, save/restore the FP/SIMD register state, modify the CPU/task FP/SIMD state. e.g, fpsimd_save(), task_fpsimd_load() - save/restore task FP/SIMD registers fpsimd_bind_task_to_cpu() \ - Update the "state" metadata for CPU/task. fpsimd_bind_state_to_cpu() / fpsimd_update_current_state() - Update the fp/simd state for the current task from memory. These must not be called in the absence of FP/SIMD. Put in a WARNING to make sure they are not invoked in the absence of FP/SIMD. KVM also uses the TIF_FOREIGN_FPSTATE flag to manage the FP/SIMD state on the CPU. However, without FP/SIMD support we trap all accesses and inject undefined instruction. Thus we should never "load" guest state. Add a sanity check to make sure this is valid. Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD") Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Acked-by: Marc Zyngier Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 0ec337059d71e9a2f66f223d47a9ae428a435c41 Author: Alexandru Elisei Date: Mon Jan 27 10:36:52 2020 +0000 KVM: arm64: Treat emulated TVAL TimerValue as a signed 32-bit integer commit 4a267aa707953a9a73d1f5dc7f894dd9024a92be upstream. According to the ARM ARM, registers CNT{P,V}_TVAL_EL0 have bits [63:32] RES0 [1]. When reading the register, the value is truncated to the least significant 32 bits [2], and on writes, TimerValue is treated as a signed 32-bit integer [1, 2]. When the guest behaves correctly and writes 32-bit values, treating TVAL as an unsigned 64 bit register works as expected. However, things start to break down when the guest writes larger values, because (u64)0x1_ffff_ffff = 8589934591. but (s32)0x1_ffff_ffff = -1, and the former will cause the timer interrupt to be asserted in the future, but the latter will cause it to be asserted now. Let's treat TVAL as a signed 32-bit register on writes, to match the behaviour described in the architecture, and the behaviour experimentally exhibited by the virtual timer on a non-vhe host. [1] Arm DDI 0487E.a, section D13.8.18 [2] Arm DDI 0487E.a, section D11.2.4 Signed-off-by: Alexandru Elisei [maz: replaced the read-side mask with lower_32_bits] Signed-off-by: Marc Zyngier Fixes: 8fa761624871 ("KVM: arm/arm64: arch_timer: Fix CNTP_TVAL calculation") Link: https://lore.kernel.org/r/20200127103652.2326-1-alexandru.elisei@arm.com Signed-off-by: Greg Kroah-Hartman commit a17d21640453719e6f7a9de53d2982f0f7b2357b Author: Eric Auger Date: Fri Jan 24 15:25:34 2020 +0100 KVM: arm64: pmu: Fix chained SW_INCR counters commit aa76829171e98bd75a0cc00b6248eca269ac7f4f upstream. At the moment a SW_INCR counter always overflows on 32-bit boundary, independently on whether the n+1th counter is programmed as CHAIN. Check whether the SW_INCR counter is a 64b counter and if so, implement the 64b logic. Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters") Signed-off-by: Eric Auger Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200124142535.29386-4-eric.auger@redhat.com Signed-off-by: Greg Kroah-Hartman commit a6229d1b5c223bdcbf031b1fd62ef751cc409d78 Author: Eric Auger Date: Fri Jan 24 15:25:32 2020 +0100 KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset commit 3837407c1aa1101ed5e214c7d6041e7a23335c6e upstream. The specification says PMSWINC increments PMEVCNTR_EL1 by 1 if PMEVCNTR_EL0 is enabled and configured to count SW_INCR. For PMEVCNTR_EL0 to be enabled, we need both PMCNTENSET to be set for the corresponding event counter but we also need the PMCR.E bit to be set. Fixes: 7a0adc7064b8 ("arm64: KVM: Add access handler for PMSWINC register") Signed-off-by: Eric Auger Signed-off-by: Marc Zyngier Reviewed-by: Andrew Murray Acked-by: Marc Zyngier Link: https://lore.kernel.org/r/20200124142535.29386-2-eric.auger@redhat.com Signed-off-by: Greg Kroah-Hartman commit 93a509cf118223f8372d49fc8711dbbec27c1c67 Author: James Morse Date: Tue Jan 21 12:33:56 2020 +0000 KVM: arm: Make inject_abt32() inject an external abort instead commit 21aecdbd7f3ab02c9b82597dc733ee759fb8b274 upstream. KVM's inject_abt64() injects an external-abort into an aarch64 guest. The KVM_CAP_ARM_INJECT_EXT_DABT is intended to do exactly this, but for an aarch32 guest inject_abt32() injects an implementation-defined exception, 'Lockdown fault'. Change this to external abort. For non-LPAE we now get the documented: | Unhandled fault: external abort on non-linefetch (0x008) at 0x9c800f00 and for LPAE: | Unhandled fault: synchronous external abort (0x210) at 0x9c800f00 Fixes: 74a64a981662a ("KVM: arm/arm64: Unify 32bit fault injection") Reported-by: Beata Michalska Signed-off-by: James Morse Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200121123356.203000-3-james.morse@arm.com Signed-off-by: Greg Kroah-Hartman commit 9cce31930ad32bd064d938a903305c57d529976a Author: James Morse Date: Tue Jan 21 12:33:55 2020 +0000 KVM: arm: Fix DFSR setting for non-LPAE aarch32 guests commit 018f22f95e8a6c3e27188b7317ef2c70a34cb2cd upstream. Beata reports that KVM_SET_VCPU_EVENTS doesn't inject the expected exception to a non-LPAE aarch32 guest. The host intends to inject DFSR.FS=0x14 "IMPLEMENTATION DEFINED fault (Lockdown fault)", but the guest receives DFSR.FS=0x04 "Fault on instruction cache maintenance". This fault is hooked by do_translation_fault() since ARMv6, which goes on to silently 'handle' the exception, and restart the faulting instruction. It turns out, when TTBCR.EAE is clear DFSR is split, and FS[4] has to shuffle up to DFSR[10]. As KVM only does this in one place, fix up the static values. We now get the expected: | Unhandled fault: lock abort (0x404) at 0x9c800f00 Fixes: 74a64a981662a ("KVM: arm/arm64: Unify 32bit fault injection") Reported-by: Beata Michalska Signed-off-by: James Morse Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200121123356.203000-2-james.morse@arm.com Signed-off-by: Greg Kroah-Hartman commit 48f9ec2020b31abf92544b01d2719a2b18b5df7e Author: Gavin Shan Date: Tue Jan 21 16:56:59 2020 +1100 KVM: arm/arm64: Fix young bit from mmu notifier commit cf2d23e0bac9f6b5cd1cba8898f5f05ead40e530 upstream. kvm_test_age_hva() is called upon mmu_notifier_test_young(), but wrong address range has been passed to handle_hva_to_gpa(). With the wrong address range, no young bits will be checked in handle_hva_to_gpa(). It means zero is always returned from mmu_notifier_test_young(). This fixes the issue by passing correct address range to the underly function handle_hva_to_gpa(), so that the hardware young (access) bit will be visited. Fixes: 35307b9a5f7e ("arm/arm64: KVM: Implement Stage-2 page aging") Signed-off-by: Gavin Shan Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200121055659.19560-1-gshan@redhat.com Signed-off-by: Greg Kroah-Hartman commit 537493f1460a27f81eb1499d62ec974c6103690e Author: Suzuki K Poulose Date: Mon Jan 13 23:30:21 2020 +0000 arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations commit c9d66999f064947e6b577ceacc1eb2fbca6a8d3c upstream. When fp/simd is not supported on the system, fail the operations of FP/SIMD regsets. Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD") Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 37014cee458cd4a5b8d849f6ad0af82cc6f2c3e0 Author: Suzuki K Poulose Date: Mon Jan 13 23:30:20 2020 +0000 arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly commit 7559950aef1ab8792c50797c6c5c7c5150a02460 upstream. We set the compat_elf_hwcap bits unconditionally on arm64 to include the VFP and NEON support. However, the FP/SIMD unit is optional on Arm v8 and thus could be missing. We already handle this properly in the kernel, but still advertise to the COMPAT applications that the VFP is available. Fix this to make sure we only advertise when we really have them. Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD") Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 1a2b07a2c3dde4e2ce6398a7a7187b49489493ad Author: Suzuki K Poulose Date: Mon Jan 13 23:30:19 2020 +0000 arm64: cpufeature: Fix the type of no FP/SIMD capability commit 449443c03d8cfdacf7313e17779a2594ebf87e6d upstream. The NO_FPSIMD capability is defined with scope SYSTEM, which implies that the "absence" of FP/SIMD on at least one CPU is detected only after all the SMP CPUs are brought up. However, we use the status of this capability for every context switch. So, let us change the scope to LOCAL_CPU to allow the detection of this capability as and when the first CPU without FP is brought up. Also, the current type allows hotplugged CPU to be brought up without FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace up. Fix both of these issues by changing the capability to BOOT_RESTRICTED_LOCAL_CPU_FEATURE. Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD") Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Ard Biesheuvel Reviewed-by: Catalin Marinas Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit ba95651cefe1d1c4a262f8e78548812cd6ab073c Author: Qais Yousef Date: Tue Dec 24 11:54:04 2019 +0000 sched/uclamp: Fix a bug in propagating uclamp value in new cgroups commit 7226017ad37a888915628e59a84a2d1e57b40707 upstream. When a new cgroup is created, the effective uclamp value wasn't updated with a call to cpu_util_update_eff() that looks at the hierarchy and update to the most restrictive values. Fix it by ensuring to call cpu_util_update_eff() when a new cgroup becomes online. Without this change, the newly created cgroup uses the default root_task_group uclamp values, which is 1024 for both uclamp_{min, max}, which will cause the rq to to be clamped to max, hence cause the system to run at max frequency. The problem was observed on Ubuntu server and was reproduced on Debian and Buildroot rootfs. By default, Ubuntu and Debian create a cpu controller cgroup hierarchy and add all tasks to it - which creates enough noise to keep the rq uclamp value at max most of the time. Imitating this behavior makes the problem visible in Buildroot too which otherwise looks fine since it's a minimal userspace. Fixes: 0b60ba2dd342 ("sched/uclamp: Propagate parent clamps") Reported-by: Doug Smythies Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Tested-by: Doug Smythies Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/ Signed-off-by: Greg Kroah-Hartman commit 5d42957c9045d90bca865f534b6114551034c944 Author: Olof Johansson Date: Wed Dec 18 01:18:49 2019 +0100 ARM: 8949/1: mm: mark free_memmap as __init commit 31f3010e60522ede237fb145a63b4af5a41718c2 upstream. As of commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"), free_memmap() might not always be inlined, and thus is triggering a section warning: WARNING: vmlinux.o(.text.unlikely+0x904): Section mismatch in reference from the function free_memmap() to the function .meminit.text:memblock_free() Mark it as __init, since the faller (free_unused_memmap) already is. Fixes: ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") Signed-off-by: Olof Johansson Signed-off-by: Russell King Signed-off-by: Greg Kroah-Hartman commit 199808393ef710f880305b259b1767278da425ba Author: Eric Auger Date: Fri Dec 13 10:42:37 2019 +0100 KVM: arm/arm64: vgic-its: Fix restoration of unmapped collections commit 8c58be34494b7f1b2adb446e2d8beeb90e5de65b upstream. Saving/restoring an unmapped collection is a valid scenario. For example this happens if a MAPTI command was sent, featuring an unmapped collection. At the moment the CTE fails to be restored. Only compare against the number of online vcpus if the rdist base is set. Fixes: ea1ad53e1e31a ("KVM: arm64: vgic-its: Collection table save/restore") Signed-off-by: Eric Auger Signed-off-by: Marc Zyngier Reviewed-by: Zenghui Yu Link: https://lore.kernel.org/r/20191213094237.19627-1-eric.auger@redhat.com Signed-off-by: Greg Kroah-Hartman commit c406e5352155d0616e233621b0c87247c20d9370 Author: Claudiu Beznea Date: Wed Dec 11 13:04:08 2019 +0200 ARM: at91: pm: use of_device_id array to find the proper shdwc node commit ec6e618c8c018c1361d77789a100a5f6f6317178 upstream. Use of_device_id array to find the proper shdwc compatibile node. SAM9X60's shdwc changes were not integrated when commit eaedc0d379da ("ARM: at91: pm: add ULP1 support for SAM9X60") was integrated. Fixes: eaedc0d379da ("ARM: at91: pm: add ULP1 support for SAM9X60") Signed-off-by: Claudiu Beznea Link: https://lore.kernel.org/r/1576062248-18514-3-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 69f08f44b4c1292621c91c568dfd3cd8f1f3b7fd Author: Claudiu Beznea Date: Wed Dec 11 13:04:07 2019 +0200 ARM: at91: pm: use SAM9X60 PMC's compatible commit 6b9dfd986a81a999a27b6ed9dbe91203089c62dd upstream. SAM9X60 PMC's has a different PMC. It was not integrated at the moment commit 01c7031cfa73 ("ARM: at91: pm: initial PM support for SAM9X60") was published. Fixes: 01c7031cfa73 ("ARM: at91: pm: initial PM support for SAM9X60") Signed-off-by: Claudiu Beznea Link: https://lore.kernel.org/r/1576062248-18514-2-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 451b91d88a1d685d42933cc6c22c8592c1974a1b Author: Shameer Kolothum Date: Wed Nov 13 16:11:38 2019 +0000 iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA commit 935d43ba272e0001f8ef446a3eff15d8175cb11b upstream. CMDQ_OP_TLBI_NH_VA requires VMID and this was missing since commit 1c27df1c0a82 ("iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA"). Add it back. Fixes: 1c27df1c0a82 ("iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA") Signed-off-by: Shameer Kolothum Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit c4faf627c76e7c8cc7eef5f33b0aed212d314041 Author: Alexey Kardashevskiy Date: Mon Dec 16 15:19:22 2019 +1100 powerpc/pseries: Allow not having ibm, hypertas-functions::hcall-multi-tce for DDW commit 7559d3d295f3365ea7ac0c0274c05e633fe4f594 upstream. By default a pseries guest supports a H_PUT_TCE hypercall which maps a single IOMMU page in a DMA window. Additionally the hypervisor may support H_PUT_TCE_INDIRECT/H_STUFF_TCE which update multiple TCEs at once; this is advertised via the device tree /rtas/ibm,hypertas-functions property which Linux converts to FW_FEATURE_MULTITCE. FW_FEATURE_MULTITCE is checked when dma_iommu_ops is used; however the code managing the huge DMA window (DDW) ignores it and calls H_PUT_TCE_INDIRECT even if it is explicitly disabled via the "multitce=off" kernel command line parameter. This adds FW_FEATURE_MULTITCE checking to the DDW code path. This changes tce_build_pSeriesLP to take liobn and page size as the huge window does not have iommu_table descriptor which usually the place to store these numbers. Fixes: 4e8b0cf46b25 ("powerpc/pseries: Add support for dynamic dma windows") Signed-off-by: Alexey Kardashevskiy Reviewed-by: Thiago Jung Bauermann Tested-by: Thiago Jung Bauermann Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20191216041924.42318-3-aik@ozlabs.ru Signed-off-by: Greg Kroah-Hartman commit cff30edec932d21b3f20af7379148adade442124 Author: Tyrel Datwyler Date: Mon Jan 20 14:10:02 2020 -0800 powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning commit aff8c8242bc638ba57247ae1ec5f272ac3ed3b92 upstream. Commit e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table") missed an iommu_table allocation in the pseries vio code. The iommu_table is allocated with kzalloc and as a result the associated kref gets a value of zero. This has the side effect that during a DLPAR remove of the associated virtual IOA the iommu_tce_table_put() triggers a use-after-free underflow warning. Call Trace: [c0000002879e39f0] [c00000000071ecb4] refcount_warn_saturate+0x184/0x190 (unreliable) [c0000002879e3a50] [c0000000000500ac] iommu_tce_table_put+0x9c/0xb0 [c0000002879e3a70] [c0000000000f54e4] vio_dev_release+0x34/0x70 [c0000002879e3aa0] [c00000000087cfa4] device_release+0x54/0xf0 [c0000002879e3b10] [c000000000d64c84] kobject_cleanup+0xa4/0x240 [c0000002879e3b90] [c00000000087d358] put_device+0x28/0x40 [c0000002879e3bb0] [c0000000007a328c] dlpar_remove_slot+0x15c/0x250 [c0000002879e3c50] [c0000000007a348c] remove_slot_store+0xac/0xf0 [c0000002879e3cd0] [c000000000d64220] kobj_attr_store+0x30/0x60 [c0000002879e3cf0] [c0000000004ff13c] sysfs_kf_write+0x6c/0xa0 [c0000002879e3d10] [c0000000004fde4c] kernfs_fop_write+0x18c/0x260 [c0000002879e3d60] [c000000000410f3c] __vfs_write+0x3c/0x70 [c0000002879e3d80] [c000000000415408] vfs_write+0xc8/0x250 [c0000002879e3dd0] [c0000000004157dc] ksys_write+0x7c/0x120 [c0000002879e3e20] [c00000000000b278] system_call+0x5c/0x68 Further, since the refcount was always zero the iommu_tce_table_put() fails to call the iommu_table release function resulting in a leak. Fix this issue be initilizing the iommu_table kref immediately after allocation. Fixes: e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table") Signed-off-by: Tyrel Datwyler Reviewed-by: Alexey Kardashevskiy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1579558202-26052-1-git-send-email-tyreld@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit 5ca556d5edfd66cd89da7e2b26dfbd4858735e71 Author: Vaibhav Jain Date: Wed Jan 22 21:21:40 2020 +0530 powerpc/papr_scm: Fix leaking 'bus_desc.provider_name' in some paths commit 5649607a8d0b0e019a4db14aab3de1e16c3a2b4f upstream. String 'bus_desc.provider_name' allocated inside papr_scm_nvdimm_init() will leaks in case call to nvdimm_bus_register() fails or when papr_scm_remove() is called. This minor patch ensures that 'bus_desc.provider_name' is freed in error path for nvdimm_bus_register() as well as in papr_scm_remove(). Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Vaibhav Jain Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200122155140.120429-1-vaibhav@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit 05a23f436471816b6a22afd0511d3747972a564b Author: Christophe Leroy Date: Tue Jan 14 08:13:10 2020 +0000 powerpc/ptdump: Only enable PPC_CHECK_WX with STRICT_KERNEL_RWX commit f509247b08f2dcf7754d9ed85ad69a7972aa132b upstream. ptdump_check_wx() is called from mark_rodata_ro() which only exists when CONFIG_STRICT_KERNEL_RWX is selected. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/922d4939c735c6b52b4137838bcc066fffd4fc33.1578989545.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman commit 17f37249d7db610163e3ab87a8e6e8dd6dbd6112 Author: Christophe Leroy Date: Tue Jan 14 08:13:08 2020 +0000 powerpc/ptdump: Fix W+X verification call in mark_rodata_ro() commit e26ad936dd89d79f66c2b567f700e0c2a7103070 upstream. ptdump_check_wx() also have to be called when pages are mapped by blocks. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/37517da8310f4457f28921a4edb88fb21d27b62a.1578989531.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman commit 6d7edac1469eddb98d66947f3cdb726bc5e695d2 Author: Ram Pai Date: Mon Dec 16 15:19:21 2019 +1100 Revert "powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests" commit d862b44133b7a1d7de25288e09eabf4df415e971 upstream. This reverts commit edea902c1c1efb855f77e041f9daf1abe7a9768a. At the time the change allowed direct DMA ops for secure VMs; however since then we switched on using SWIOTLB backed with IOMMU (direct mapping) and to make this work, we need dma_iommu_ops which handles all cases including TCE mapping I/O pages in the presence of an IOMMU. Fixes: edea902c1c1e ("powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests") Signed-off-by: Ram Pai [aik: added "revert" and "fixes:"] Signed-off-by: Alexey Kardashevskiy Reviewed-by: Thiago Jung Bauermann Tested-by: Thiago Jung Bauermann Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20191216041924.42318-2-aik@ozlabs.ru Signed-off-by: Greg Kroah-Hartman commit 45c764da005af27caad9789e8e013fcaa6126ade Author: Douglas Anderson Date: Thu Feb 14 09:36:33 2019 -0800 soc: qcom: rpmhpd: Set 'active_only' for active only power domains commit 5d0d4d42bed0090d3139e7c5ca1587d76d48add6 upstream. The 'active_only' attribute was accidentally never set to true for any power domains meaning that all the code handling this attribute was dead. NOTE that the RPM power domain code (as opposed to the RPMh one) gets this right. Acked-by: Rajendra Nayak Reviewed-by: Stephen Boyd Fixes: 279b7e8a62cc ("soc: qcom: rpmhpd: Add RPMh power domain driver") Signed-off-by: Douglas Anderson Link: https://lore.kernel.org/r/20190214173633.211000-1-dianders@chromium.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 0bf48acf43385315b4495a17fe2eb92af7a8fc1c Author: Zhengyuan Liu Date: Sat Dec 14 00:27:12 2019 +0800 tools/power/acpi: fix compilation error commit 1985f8c7f9a42a651a9750d6fcadc74336d182df upstream. If we compile tools/acpi target in the top source directory, we'd get a compilation error showing as bellow: # make tools/acpi DESCEND power/acpi DESCEND tools/acpidbg CC tools/acpidbg/acpidbg.o Assembler messages: Fatal error: can't create /home/lzy/kernel-upstream/power/acpi/\ tools/acpidbg/acpidbg.o: No such file or directory ../../Makefile.rules:26: recipe for target '/home/lzy/kernel-upstream/\ power/acpi/tools/acpidbg/acpidbg.o' failed make[3]: *** [/home/lzy/kernel-upstream//power/acpi/tools/acpidbg/\ acpidbg.o] Error 1 Makefile:19: recipe for target 'acpidbg' failed make[2]: *** [acpidbg] Error 2 Makefile:54: recipe for target 'acpi' failed make[1]: *** [acpi] Error 2 Makefile:1607: recipe for target 'tools/acpi' failed make: *** [tools/acpi] Error 2 Fixes: d5a4b1a540b8 ("tools/power/acpi: Remove direct kernel source include reference") Signed-off-by: Zhengyuan Liu Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 5d3453a5660b0872e0e2dfe26aaa29b9ffbe49b1 Author: Alexandre Belloni Date: Fri Jan 10 18:20:07 2020 +0100 ARM: dts: at91: sama5d3: define clock rate range for tcb1 commit a7e0f3fc01df4b1b7077df777c37feae8c9e8b6d upstream. The clock rate range for the TCB1 clock is missing. define it in the device tree. Reported-by: Karl Rudbæk Olsen Fixes: d2e8190b7916 ("ARM: at91/dt: define sama5d3 clocks") Link: https://lore.kernel.org/r/20200110172007.1253659-2-alexandre.belloni@bootlin.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 581a5fbf4f6df71757f34d5f36dff113ddaf0af2 Author: Alexandre Belloni Date: Fri Jan 10 18:20:06 2020 +0100 ARM: dts: at91: sama5d3: fix maximum peripheral clock rates commit ee0aa926ddb0bd8ba59e33e3803b3b5804e3f5da upstream. Currently the maximum rate for peripheral clock is calculated based on a typical 133MHz MCK. The maximum frequency is defined in the datasheet as a ratio to MCK. Some sama5d3 platforms are using a 166MHz MCK. Update the device trees to match the maximum rate based on 166MHz. Reported-by: Karl Rudbæk Olsen Fixes: d2e8190b7916 ("ARM: at91/dt: define sama5d3 clocks") Link: https://lore.kernel.org/r/20200110172007.1253659-1-alexandre.belloni@bootlin.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 8e8802c935097aa2ed2063d288c6646896237d16 Author: Martin Blumenstingl Date: Wed Dec 25 02:06:07 2019 +0100 ARM: dts: meson8b: use the actual frequency for the GPU's 364MHz OPP commit c3dd3315ab58b2cfa1916df55b0d0f9fbd94266f upstream. The clock setup on Meson8 cannot achieve a Mali frequency of exactly 182.15MHz. The vendor driver uses "FCLK_DIV7 / 1" for this frequency, which translates to 2550MHz / 7 / 1 = 364285714Hz. Update the GPU operating point to that specific frequency to not confuse myself when comparing the frequency from the .dts with the actual clock rate on the system. Fixes: c3ea80b6138cae ("ARM: dts: meson8b: add the Mali-450 MP2 GPU") Signed-off-by: Martin Blumenstingl Signed-off-by: Kevin Hilman Signed-off-by: Greg Kroah-Hartman commit 16665fffafee50b638b73eacff80f930a0a2a617 Author: Martin Blumenstingl Date: Wed Dec 25 02:06:06 2019 +0100 ARM: dts: meson8: use the actual frequency for the GPU's 182.1MHz OPP commit fe634a7a9a57fb736e39fb71aa9adc6448a90f94 upstream. The clock setup on Meson8 cannot achieve a Mali frequency of exactly 182.15MHz. The vendor driver uses "FCLK_DIV7 / 2" for this frequency, which translates to 2550MHz / 7 / 2 = 182142857Hz. Update the GPU operating point to that specific frequency to not confuse myself when comparing the frequency from the .dts with the actual clock rate on the system. Fixes: 7d3f6b536e72c9 ("ARM: dts: meson8: add the Mali-450 MP6 GPU") Signed-off-by: Martin Blumenstingl Signed-off-by: Kevin Hilman Signed-off-by: Greg Kroah-Hartman commit 3d2d8cd29c1b76686ea18cad233276fa83f253cf Author: Baruch Siach Date: Thu Dec 19 12:28:45 2019 +0200 arm64: dts: marvell: clearfog-gt-8k: fix switch cpu port node commit 62bba54d99407aedfe9b0a02e72e23c06e2b0116 upstream. Explicitly set the switch cpu (upstream) port phy-mode and managed properties. This fixes the Marvell 88E6141 switch serdes configuration with the recently enabled phylink layer. Fixes: a6120833272c ("arm64: dts: add support for SolidRun Clearfog GT 8K") Reported-by: Denis Odintsov Signed-off-by: Baruch Siach Reviewed-by: Andrew Lunn Signed-off-by: Gregory CLEMENT Signed-off-by: Greg Kroah-Hartman commit c44134c2a287ed11c00d95fcc57ac4c47db1f493 Author: Kuninori Morimoto Date: Mon Dec 16 11:08:22 2019 +0900 arm64: dts: renesas: r8a77990: ebisu: Remove clkout-lr-synchronous from sound commit bf2b74ce9b33a2edd6ba1930ce60a71830790910 upstream. rcar_sound doesn't support clkout-lr-synchronous in upstream. It was supported under out-of-tree rcar_sound. upstream rcar_sound is supporting - clkout-lr-synchronous + clkout-lr-asynchronous Signed-off-by: Kuninori Morimoto Link: https://lore.kernel.org/r/87mubt3tux.wl-kuninori.morimoto.gx@renesas.com Fixes: 56629fcba94c698d ("arm64: dts: renesas: ebisu: Enable Audio") Signed-off-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman commit 6f9da85057c1b4a6291c82a6db0d40e0865fa66a Author: Tero Kristo Date: Wed Dec 11 08:07:18 2019 -0600 ARM: dts: am43xx: add support for clkout1 clock commit 01053dadb79d63b65f7b353e68b4b6ccf4effedb upstream. clkout1 clock node and its generation tree was missing. Add this based on the data on TRM and PRCM functional spec. commit 664ae1ab2536 ("ARM: dts: am43xx: add clkctrl nodes") effectively reverted this commit 8010f13a40d3 ("ARM: dts: am43xx: add support for clkout1 clock") which is needed for the ov2659 camera sensor clock definition hence it is being re-applied here. Note that because of the current dts node name dependency for mapping to clock domain, we must still use "clkout1-*ck" naming instead of generic "clock@" naming for the node. And because of this, it's probably best to apply the dts node addition together along with the other clock changes. Fixes: 664ae1ab2536 ("ARM: dts: am43xx: add clkctrl nodes") Signed-off-by: Tero Kristo Tested-by: Benoit Parrot Acked-by: Tony Lindgren Signed-off-by: Benoit Parrot Signed-off-by: Tony Lindgren Signed-off-by: Greg Kroah-Hartman commit bd13285419b9bb3a254c9bdd6382640266a39665 Author: Ingo van Lil Date: Tue Dec 3 15:21:47 2019 +0100 ARM: dts: at91: Reenable UART TX pull-ups commit 9d39d86cd4af2b17b970d63307daad71f563d207 upstream. Pull-ups for SAM9 UART/USART TX lines were disabled in a previous commit. However, several chips in the SAM9 family require pull-ups to prevent the TX lines from falling (and causing an endless break condition) when the transceiver is disabled. From the SAM9G20 datasheet, 32.5.1: "To prevent the TXD line from falling when the USART is disabled, the use of an internal pull up is mandatory.". This commit reenables the pull-ups for all chips having that sentence in their datasheets. Fixes: 5e04822f7db5 ("ARM: dts: at91: fixes uart pinctrl, set pullup on rx, clear pullup on tx") Signed-off-by: Ingo van Lil Cc: Peter Rosin Link: https://lore.kernel.org/r/20191203142147.875227-1-inguin@gmx.de Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit d01521db86ac5a51a2a3d66a8a73571cca12bd97 Author: Russell King Date: Sat Nov 16 11:06:56 2019 +0000 arm64: dts: uDPU: fix broken ethernet commit 1eebac0240580b531954b02c05068051df41142a upstream. The uDPU uses both ethernet controllers, which ties up COMPHY 0 for eth1 and COMPHY 1 for eth0, with no USB3 comphy. The addition of COMPHY support made the kernel override the setup by the boot loader breaking this platform by assuming that COMPHY 0 was always used for USB3. Delete the USB3 COMPHY definition at platform level, and add phy specifications for the ethernet channels. Fixes: bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with their PHYs") Signed-off-by: Russell King Signed-off-by: Gregory CLEMENT Signed-off-by: Greg Kroah-Hartman commit 5ee40005f420dafb7a0c584700c7c039d635b50b Author: Jeffrey Hugo Date: Wed Nov 6 20:59:48 2019 -0800 arm64: dts: qcom: msm8998: Fix tcsr syscon size commit 05caa5bf9cab9983dd7a50428c46b7e617ba20d6 upstream. The tcsr syscon region is really 0x40000 in size. We need access to the full region so that we can access the axi resets when managing the modem subsystem. Fixes: c7833949564e ("arm64: dts: qcom: msm8998: Add smem related nodes") Signed-off-by: Jeffrey Hugo Link: https://lore.kernel.org/r/20191107045948.4341-1-jeffrey.l.hugo@gmail.com Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit bc684844e7e09d384ee838a80c894a1735bb0645 Author: Mika Westerberg Date: Wed Jan 22 19:05:20 2020 +0300 platform/x86: intel_mid_powerbtn: Take a copy of ddata commit 5e0c94d3aeeecc68c573033f08d9678fecf253bd upstream. The driver gets driver_data from memory that is marked as const (which is probably put to read-only memory) and it then modifies it. This likely causes some sort of fault to happen. Fix this by taking a copy of the structure. Fixes: c94a8ff14de3 ("platform/x86: intel_mid_powerbtn: make mid_pb_ddata const") Signed-off-by: Mika Westerberg Reviewed-by: Andy Shevchenko Signed-off-by: Andy Shevchenko Signed-off-by: Greg Kroah-Hartman commit b09e3d3e79cf9c0cb11768eb60a27e35720d7e26 Author: Jose Abreu Date: Tue Jan 14 17:09:24 2020 +0100 ARC: [plat-axs10x]: Add missing multicast filter number to GMAC node commit 7980dff398f86a618f502378fa27cf7e77449afa upstream. Add a missing property to GMAC node so that multicast filtering works correctly. Fixes: 556cc1c5f528 ("ARC: [axs101] Add support for AXS101 SDP (software development platform)") Acked-by: Alexey Brodkin Signed-off-by: Jose Abreu Signed-off-by: Vineet Gupta Signed-off-by: Greg Kroah-Hartman commit 2de1af2bcba06b97583121d9739ba9e5f5d89f49 Author: Sai Prakash Ranjan Date: Fri Dec 13 12:19:34 2019 +0530 watchdog: qcom: Use platform_get_irq_optional() for bark irq commit e0b4f4e0cf7fa9d62628d4249c765ec18dffd143 upstream. platform_get_irq() prints an error message when the interrupt is not available. So on platforms where bark interrupt is not specified, following error message is observed on SDM845. [ 2.975888] qcom_wdt 17980000.watchdog: IRQ index 0 not found This is also seen on SC7180, SM8150 SoCs as well. Fix this by using platform_get_irq_optional() instead. Fixes: 36375491a4395654 ("watchdog: qcom: support pre-timeout when the bark irq is available") Signed-off-by: Sai Prakash Ranjan Reviewed-by: Bjorn Andersson Reviewed-by: Guenter Roeck Reviewed-by: Stephen Boyd Link: https://lore.kernel.org/r/20191213064934.4112-1-saiprakash.ranjan@codeaurora.org Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Signed-off-by: Greg Kroah-Hartman commit f599ae7529e63ce597e21155c116e78926f3b7e3 Author: Andy Shevchenko Date: Thu Jan 23 15:14:35 2020 +0200 rtc: cmos: Stop using shared IRQ commit b6da197a2e9670df6f07e6698629e9ce95ab614e upstream. As reported by Guilherme G. Piccoli: ---8<---8<---8<--- The rtc-cmos interrupt setting was changed in the commit 079062b28fb4 ("rtc: cmos: prevent kernel warning on IRQ flags mismatch") in order to allow shared interrupts; according to that commit's description, some machine got kernel warnings due to the interrupt line being shared between rtc-cmos and other hardware, and rtc-cmos didn't allow IRQ sharing that time. After the aforementioned commit though it was observed a huge increase in lost HPET interrupts in some systems, observed through the following kernel message: [...] hpet1: lost 35 rtc interrupts After investigation, it was narrowed down to the shared interrupts usage when having the kernel option "irqpoll" enabled. In this case, all IRQ handlers are called for non-timer interrupts, if such handlers are setup in shared IRQ lines. The rtc-cmos IRQ handler could be set to hpet_rtc_interrupt(), which will produce the kernel "lost interrupts" message after doing work - lots of readl/writel to HPET registers, which are known to be slow. Although "irqpoll" is not a default kernel option, it's used in some contexts, one being the kdump kernel (which is an already "impaired" kernel usually running with 1 CPU available), so the performance burden could be considerable. Also, the same issue would happen (in a shorter extent though) when using "irqfixup" kernel option. In a quick experiment, a virtual machine with uptime of 2 minutes produced >300 calls to hpet_rtc_interrupt() when "irqpoll" was set, whereas without sharing interrupts this number reduced to 1 interrupt. Machines with more hardware than a VM should generate even more unnecessary HPET interrupts in this scenario. ---8<---8<---8<--- After looking into the rtc-cmos driver history and DSDT table from the Microsoft Surface 3, we may notice that Hans de Goede submitted a correct fix (see dependency below). Thus, we simply revert the culprit commit. Fixes: 079062b28fb4 ("rtc: cmos: prevent kernel warning on IRQ flags mismatch") Depends-on: a1e23a42f1bd ("rtc: cmos: Do not assume irq 8 for rtc when there are no legacy irqs") Reported-by: Guilherme G. Piccoli Cc: Hans de Goede Signed-off-by: Andy Shevchenko Tested-by: Guilherme G. Piccoli Reviewed-by: Hans de Goede Link: https://lore.kernel.org/r/20200123131437.28157-1-andriy.shevchenko@linux.intel.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 7a3aa58c0e027b5d7997bee1a64860ddc6059942 Author: Paul Kocialkowski Date: Thu Dec 12 16:31:10 2019 +0100 rtc: hym8563: Return -EINVAL if the time is known to be invalid commit f236a2a2ebabad0848ad0995af7ad1dc7029e895 upstream. The current code returns -EPERM when the voltage loss bit is set. Since the bit indicates that the time value is not valid, return -EINVAL instead, which is the appropriate error code for this situation. Fixes: dcaf03849352 ("rtc: add hym8563 rtc-driver") Signed-off-by: Paul Kocialkowski Link: https://lore.kernel.org/r/20191212153111.966923-1-paul.kocialkowski@bootlin.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit ffad5982ce5828b80316673c60f153abf8cfa11b Author: Steven Clarkson Date: Thu Jan 30 16:48:16 2020 -0800 x86/boot: Handle malformed SRAT tables during early ACPI parsing [ Upstream commit 2b73ea3796242608b4ccf019ff217156c92e92fe ] Break an infinite loop when early parsing of the SRAT table is caused by a subtable with zero length. Known to affect the ASUS WS X299 SAGE motherboard with firmware version 1201 which has a large block of zeros in its SRAT table. The kernel could boot successfully on this board/firmware prior to the introduction of early parsing this table or after a BIOS update. [ bp: Fixup whitespace damage and commit message. Make it return 0 to denote that there are no immovable regions because who knows what else is broken in this BIOS. ] Fixes: 02a3e3cdb7f1 ("x86/boot: Parse SRAT table and count immovable memory regions") Signed-off-by: Steven Clarkson Signed-off-by: Borislav Petkov Cc: linux-acpi@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=206343 Link: https://lkml.kernel.org/r/CAHKq8taGzj0u1E_i=poHUam60Bko5BpiJ9jn0fAupFUYexvdUQ@mail.gmail.com Signed-off-by: Sasha Levin commit f4818129947c87e5ce62b468a4a2516bb38f7f24 Author: Robert Milkowski Date: Thu Jan 30 09:43:25 2020 +0000 NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals commit 7dc2993a9e51dd2eee955944efec65bef90265b7 upstream. Currently, each time nfs4_do_fsinfo() is called it will do an implicit NFS4 lease renewal, which is not compliant with the NFS4 specification. This can result in a lease being expired by an NFS server. Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases") introduced implicit client lease renewal in nfs4_do_fsinfo(), which can result in the NFSv4.0 lease to expire on a server side, and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID. This can easily be reproduced by frequently unmounting a sub-mount, then stat'ing it to get it mounted again, which will delay or even completely prevent client from sending RENEW operations if no other NFS operations are issued. Eventually nfs server will expire client's lease and return an error on file access or next RENEW. This can also happen when a sub-mount is automatically unmounted due to inactivity (after nfs_mountpoint_expiry_timeout), then it is mounted again via stat(). This can result in a short window during which client's lease will expire on a server but not on a client. This specific case was observed on production systems. This patch removes the implicit lease renewal from nfs4_do_fsinfo(). Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases") Signed-off-by: Robert Milkowski Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit cf360732f811fd402d8c278124e8a6a9e43a09bf Author: Robert Milkowski Date: Tue Jan 28 08:37:47 2020 +0000 NFSv4: try lease recovery on NFS4ERR_EXPIRED commit 924491f2e476f7234d722b24171a4daff61bbe13 upstream. Currently, if an nfs server returns NFS4ERR_EXPIRED to open(), we return EIO to applications without even trying to recover. Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid") Signed-off-by: Robert Milkowski Reviewed-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 5d0a6d77b69c31554158fd4eae867aaab0d3c102 Author: Trond Myklebust Date: Sun Jan 26 17:31:13 2020 -0500 NFSv4: pnfs_roc() must use cred_fscmp() to compare creds commit 387122478775be5d9816c34aa29de53d0b926835 upstream. When comparing two 'struct cred' for equality w.r.t. behaviour under filesystem access, we need to use cred_fscmp(). Fixes: a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 86065de0eb0aa42009e73664c063536e84651c91 Author: Trond Myklebust Date: Mon Jan 6 15:25:06 2020 -0500 NFS: Fix fix of show_nfs_errors commit 118b6292195cfb86a9f43cb65610fc6d980c65f4 upstream. Casting a negative value to an unsigned long is not the same as converting it to its absolute value. Fixes: 96650e2effa2 ("NFS: Fix show_nfs_errors macros again") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 5d7030939d22cd749d14df657eed174f0958699b Author: Trond Myklebust Date: Mon Jan 6 15:25:04 2020 -0500 NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes() commit 221203ce6406273cf00e5c6397257d986c003ee6 upstream. Instead of making assumptions about the commit verifier contents, change the commit code to ensure we always check that the verifier was set by the XDR code. Fixes: f54bcf2ecee9 ("pnfs: Prepare for flexfiles by pulling out common code") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 1ef47a06d2d4b7b16e5ef0f18754040da50f1f36 Author: Trond Myklebust Date: Mon Jan 6 15:25:00 2020 -0500 NFS: Revalidate the file size on a fatal write error commit 0df68ced55443243951d02cc497be31fadf28173 upstream. If we suffer a fatal error upon writing a file, which causes us to need to revalidate the entire mapping, then we should also revalidate the file size. Fixes: d2ceb7e57086 ("NFS: Don't use page_file_mapping after removing the page") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit b7560b5b72a2f537f8e1fb17a5021cfee089b247 Author: Geert Uytterhoeven Date: Mon Dec 30 16:32:38 2019 +0100 nfs: NFS_SWAP should depend on SWAP commit 474c4f306eefbb21b67ebd1de802d005c7d7ecdc upstream. If CONFIG_SWAP=n, it does not make much sense to offer the user the option to enable support for swapping over NFS, as that will still fail at run time: # swapon /swap swapon: /swap: swapon failed: Function not implemented Fix this by adding a dependency on CONFIG_SWAP. Fixes: a564b8f0398636ba ("nfs: enable swap on NFS") Signed-off-by: Geert Uytterhoeven Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 7842c7b30d7586046593b5790a1c5ddd7de63c5e Author: Lorenz Bauer Date: Fri Feb 7 10:37:12 2020 +0000 bpf, sockmap: Check update requirements after locking commit 85b8ac01a421791d66c3a458a7f83cfd173fe3fa upstream. It's currently possible to insert sockets in unexpected states into a sockmap, due to a TOCTTOU when updating the map from a syscall. sock_map_update_elem checks that sk->sk_state == TCP_ESTABLISHED, locks the socket and then calls sock_map_update_common. At this point, the socket may have transitioned into another state, and the earlier assumptions don't hold anymore. Crucially, it's conceivable (though very unlikely) that a socket has become unhashed. This breaks the sockmap's assumption that it will get a callback via sk->sk_prot->unhash. Fix this by checking the (fixed) sk_type and sk_protocol without the lock, followed by a locked check of sk_state. Unfortunately it's not possible to push the check down into sock_(map|hash)_update_common, since BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB run before the socket has transitioned from TCP_SYN_RECV into TCP_ESTABLISHED. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Lorenz Bauer Signed-off-by: Daniel Borkmann Reviewed-by: Jakub Sitnicki Link: https://lore.kernel.org/bpf/20200207103713.28175-1-lmb@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 45d7b0e316d95ad89949fe943630982946c8a6ed Author: Martin KaFai Lau Date: Fri Feb 7 00:18:10 2020 -0800 bpf: Improve bucket_log calculation logic commit 88d6f130e5632bbf419a2e184ec7adcbe241260b upstream. It was reported that the max_t, ilog2, and roundup_pow_of_two macros have exponential effects on the number of states in the sparse checker. This patch breaks them up by calculating the "nbuckets" first so that the "bucket_log" only needs to take ilog2(). In addition, Linus mentioned: Patch looks good, but I'd like to point out that it's not just sparse. You can see it with a simple make net/core/bpf_sk_storage.i grep 'smap->bucket_log = ' net/core/bpf_sk_storage.i | wc and see the end result: 1 365071 2686974 That's one line (the assignment line) that is 2,686,974 characters in length. Now, sparse does happen to react particularly badly to that (I didn't look to why, but I suspect it's just that evaluating all the types that don't actually ever end up getting used ends up being much more expensive than it should be), but I bet it's not good for gcc either. Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") Reported-by: Randy Dunlap Reported-by: Luc Van Oostenryck Suggested-by: Linus Torvalds Signed-off-by: Martin KaFai Lau Signed-off-by: Daniel Borkmann Reviewed-by: Luc Van Oostenryck Link: https://lore.kernel.org/bpf/20200207081810.3918919-1-kafai@fb.com Signed-off-by: Greg Kroah-Hartman commit cb675fde4c44a888abe25f0a1ad289d520eabd28 Author: Jakub Sitnicki Date: Thu Feb 6 12:16:52 2020 +0100 selftests/bpf: Test freeing sockmap/sockhash with a socket in it commit 5d3919a953c3c96c02fc7a337f8376cde43ae31f upstream. Commit 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down") introduced sleeping issues inside RCU critical sections and while holding a spinlock on sockmap/sockhash tear-down. There has to be at least one socket in the map for the problem to surface. This adds a test that triggers the warnings for broken locking rules. Not a fix per se, but rather tooling to verify the accompanying fixes. Run on a VM with 1 vCPU to reproduce the warnings. Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down") Signed-off-by: Jakub Sitnicki Signed-off-by: Daniel Borkmann Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200206111652.694507-4-jakub@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 1098f9696152ae215abeca4ba3ace2cf3327f8b0 Author: Jakub Sitnicki Date: Thu Feb 6 12:16:51 2020 +0100 bpf, sockhash: Synchronize_rcu before free'ing map commit 0b2dc83906cf1e694e48003eae5df8fa63f76fd9 upstream. We need to have a synchronize_rcu before free'ing the sockhash because any outstanding psock references will have a pointer to the map and when they use it, this could trigger a use after free. This is a sister fix for sockhash, following commit 2bb90e5cc90e ("bpf: sockmap, synchronize_rcu before free'ing map") which addressed sockmap, which comes from a manual audit. Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Jakub Sitnicki Signed-off-by: Daniel Borkmann Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200206111652.694507-3-jakub@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 657a17ce530e5b6d15b186cc7901960bded16e8b Author: Jakub Sitnicki Date: Thu Feb 6 12:16:50 2020 +0100 bpf, sockmap: Don't sleep while holding RCU lock on tear-down commit db6a5018b6e008c1d69c6628cdaa9541b8e70940 upstream. rcu_read_lock is needed to protect access to psock inside sock_map_unref when tearing down the map. However, we can't afford to sleep in lock_sock while in RCU read-side critical section. Grab the RCU lock only after we have locked the socket. This fixes RCU warnings triggerable on a VM with 1 vCPU when free'ing a sockmap/sockhash that contains at least one socket: | ============================= | WARNING: suspicious RCU usage | 5.5.0-04005-g8fc91b972b73 #450 Not tainted | ----------------------------- | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section! | | other info that might help us debug this: | | | rcu_scheduler_active = 2, debug_locks = 1 | 4 locks held by kworker/0:1/62: | #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0 | #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0 | #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_map_free+0x5/0x170 | #3: ffff8881368c5df8 (&stab->lock){+...}, at: sock_map_free+0x64/0x170 | | stack backtrace: | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b972b73 #450 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 | Workqueue: events bpf_map_free_deferred | Call Trace: | dump_stack+0x71/0xa0 | ___might_sleep+0x105/0x190 | lock_sock_nested+0x28/0x90 | sock_map_free+0x95/0x170 | bpf_map_free_deferred+0x58/0x80 | process_one_work+0x260/0x5e0 | worker_thread+0x4d/0x3e0 | kthread+0x108/0x140 | ? process_one_work+0x5e0/0x5e0 | ? kthread_park+0x90/0x90 | ret_from_fork+0x3a/0x50 | ============================= | WARNING: suspicious RCU usage | 5.5.0-04005-g8fc91b972b73-dirty #452 Not tainted | ----------------------------- | include/linux/rcupdate.h:272 Illegal context switch in RCU read-side critical section! | | other info that might help us debug this: | | | rcu_scheduler_active = 2, debug_locks = 1 | 4 locks held by kworker/0:1/62: | #0: ffff88813b019748 ((wq_completion)events){+.+.}, at: process_one_work+0x1d7/0x5e0 | #1: ffffc900000abe50 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x1d7/0x5e0 | #2: ffffffff82065d20 (rcu_read_lock){....}, at: sock_hash_free+0x5/0x1d0 | #3: ffff888139966e00 (&htab->buckets[i].lock){+...}, at: sock_hash_free+0x92/0x1d0 | | stack backtrace: | CPU: 0 PID: 62 Comm: kworker/0:1 Not tainted 5.5.0-04005-g8fc91b972b73-dirty #452 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 | Workqueue: events bpf_map_free_deferred | Call Trace: | dump_stack+0x71/0xa0 | ___might_sleep+0x105/0x190 | lock_sock_nested+0x28/0x90 | sock_hash_free+0xec/0x1d0 | bpf_map_free_deferred+0x58/0x80 | process_one_work+0x260/0x5e0 | worker_thread+0x4d/0x3e0 | kthread+0x108/0x140 | ? process_one_work+0x5e0/0x5e0 | ? kthread_park+0x90/0x90 | ret_from_fork+0x3a/0x50 Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down") Signed-off-by: Jakub Sitnicki Signed-off-by: Daniel Borkmann Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200206111652.694507-2-jakub@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 1dfc34bd009a49367c56e174e9205a16e9f36752 Author: Toke Høiland-Jørgensen Date: Thu Feb 6 11:29:06 2020 +0100 bpftool: Don't crash on missing xlated program instructions commit d95f1e8b462c4372ac409886070bb8719d8a4d3a upstream. Turns out the xlated program instructions can also be missing if kptr_restrict sysctl is set. This means that the previous fix to check the jited_prog_insns pointer was insufficient; add another check of the xlated_prog_insns pointer as well. Fixes: 5b79bcdf0362 ("bpftool: Don't crash on missing jited insns or ksyms") Fixes: cae73f233923 ("bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()") Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Daniel Borkmann Reviewed-by: Quentin Monnet Link: https://lore.kernel.org/bpf/20200206102906.112551-1-toke@redhat.com Signed-off-by: Greg Kroah-Hartman commit ec81471a70d100fed872f3f25e069e51afb356c4 Author: Avraham Stern Date: Fri Jan 31 15:45:27 2020 +0200 iwlwifi: mvm: avoid use after free for pmsr request commit cc4255eff523f25187bb95561642941de0e57497 upstream. When a FTM request is aborted, the driver sends the abort command to the fw and waits for a response. When the response arrives, the driver calls cfg80211_pmsr_complete() for that request. However, cfg80211 frees the requested data immediately after sending the abort command, so this may lead to use after free. Fix it by clearing the request data in the driver when the abort command arrives and ignoring the fw notification that will come afterwards. Signed-off-by: Avraham Stern Fixes: fc36ffda3267 ("iwlwifi: mvm: support FTM initiator") Signed-off-by: Luca Coelho Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit b706a498638231b66cee74b20b2696bf95477f1d Author: Dongdong Liu Date: Thu Jan 23 16:26:31 2020 +0800 PCI/AER: Initialize aer_fifo commit d95f20c4f07020ebc605f3b46af4b6db9eb5fc99 upstream. Previously we did not call INIT_KFIFO() for aer_fifo. This leads to kfifo_put() sometimes returning 0 (queue full) when in fact it is not. It is easy to reproduce the problem by using aer-inject: $ aer-inject -s :82:00.0 multiple-corr-nonfatal The content of the multiple-corr-nonfatal file is as below: AER COR RCVR HL 0 1 2 3 AER UNCOR POISON_TLP HL 4 5 6 7 Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it") Link: https://lore.kernel.org/r/1579767991-103898-1-git-send-email-liudongdong3@huawei.com Signed-off-by: Dongdong Liu Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit b51ac6e721d53b68011b0be2d317d460de6e14f7 Author: Logan Gunthorpe Date: Wed Jan 8 14:32:08 2020 -0700 PCI: Don't disable bridge BARs when assigning bus resources commit 9db8dc6d0785225c42a37be7b44d1b07b31b8957 upstream. Some PCI bridges implement BARs in addition to bridge windows. For example, here's a PLX switch: 04:00.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca) (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 30, NUMA node 0 Memory at 90a00000 (32-bit, non-prefetchable) [size=256K] Bus: primary=04, secondary=05, subordinate=0a, sec-latency=0 I/O behind bridge: 00002000-00003fff Memory behind bridge: 90000000-909fffff Prefetchable memory behind bridge: 0000380000800000-0000380000bfffff Previously, when the kernel assigned resource addresses (with the pci=realloc command line parameter, for example) it could clear the struct resource corresponding to the BAR. When this happened, lspci would report this BAR as "ignored": Region 0: Memory at (32-bit, non-prefetchable) [size=256K] This is because the kernel reports a zero start address and zero flags in the corresponding sysfs resource file and in /proc/bus/pci/devices. Investigation with 'lspci -x', however, shows the BIOS-assigned address will still be programmed in the device's BAR registers. It's clearly a bug that the kernel lost track of the BAR value, but in most cases, this still won't result in a visible issue because nothing uses the memory, so nothing is affected. However, when an IOMMU is in use, it will not reserve this space in the IOVA because the kernel no longer thinks the range is valid. (See dmar_init_reserved_ranges() for the Intel implementation of this.) Without the proper reserved range, a DMA mapping may allocate an IOVA that matches a bridge BAR, which results in DMA accesses going to the BAR instead of the intended RAM. The problem was in pci_assign_unassigned_root_bus_resources(). When any resource from a bridge device fails to get assigned, the code set the resource's flags to zero. This makes sense for bridge windows, as they will be re-enabled later, but for regular BARs, it makes the kernel permanently lose track of the fact that they decode address space. Change pci_assign_unassigned_root_bus_resources() and pci_assign_unassigned_bridge_resources() so they only clear "res->flags" for bridge *windows*, not bridge BARs. Fixes: da7822e5ad71 ("PCI: update bridge resources to get more big ranges when allocating space (again)") Link: https://lore.kernel.org/r/20200108213208.4612-1-logang@deltatee.com [bhelgaas: commit log, check for pci_is_bridge()] Reported-by: Kit Chow Signed-off-by: Logan Gunthorpe Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit 67016624a0be081d360ece058c1db2149e71fcca Author: Marcel Ziswiler Date: Tue Jan 7 09:14:02 2020 +0100 PCI: tegra: Fix afi_pex2_ctrl reg offset for Tegra30 commit 21a92676e1fe292acb077b13106b08c22ed36b14 upstream. Fix AFI_PEX2_CTRL reg offset for Tegra30 by moving it from the Tegra20 SoC struct where it erroneously got added. This fixes the AFI_PEX2_CTRL reg offset being uninitialised subsequently failing to bring up the third PCIe port. Fixes: adb2653b3d2e ("PCI: tegra: Add AFI_PEX2_CTRL reg offset as part of SoC struct") Signed-off-by: Marcel Ziswiler Signed-off-by: Lorenzo Pieralisi Reviewed-by: Andrew Murray Acked-by: Thierry Reding Signed-off-by: Greg Kroah-Hartman commit df26f04f23bd252219e6704f3a4c58c48415aa22 Author: Logan Gunthorpe Date: Mon Jan 6 12:03:27 2020 -0700 PCI/switchtec: Fix vep_vector_number ioread width commit 9375646b4cf03aee81bc6c305aa18cc80b682796 upstream. vep_vector_number is actually a 16 bit register which should be read with ioread16() instead of ioread32(). Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver") Link: https://lore.kernel.org/r/20200106190337.2428-3-logang@deltatee.com Reported-by: Doug Meyer Signed-off-by: Logan Gunthorpe Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit b72b8d0725b0fa3d6ae991824be7df74ecf8077f Author: Wesley Sheng Date: Mon Jan 6 12:03:26 2020 -0700 PCI/switchtec: Use dma_set_mask_and_coherent() commit aa82130a22f77c1aa5794703730304d035a0c1f4 upstream. Use dma_set_mask_and_coherent() instead of dma_set_coherent_mask() as the Switchtec hardware fully supports 64bit addressing and we should set both the streaming and coherent masks the same. [logang@deltatee.com: reworked commit message] Fixes: aff614c6339c ("switchtec: Set DMA coherent mask") Link: https://lore.kernel.org/r/20200106190337.2428-2-logang@deltatee.com Signed-off-by: Wesley Sheng Signed-off-by: Logan Gunthorpe Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit 15818c08ca79d3798da9e9f0225026abc47a379a Author: Bryan O'Donoghue Date: Thu Dec 19 13:15:38 2019 +0000 ath10k: pci: Only dump ATH10K_MEM_REGION_TYPE_IOREG when safe commit d239380196c4e27a26fa4bea73d2bf994c14ec2d upstream. ath10k_pci_dump_memory_reg() will try to access memory of type ATH10K_MEM_REGION_TYPE_IOREG however, if a hardware restart is in progress this can crash a system. Individual ioread32() time has been observed to jump from 15-20 ticks to > 80k ticks followed by a secure-watchdog bite and a system reset. Work around this corner case by only issuing the read transaction when the driver state is ATH10K_STATE_ON. Tested-on: QCA9988 PCI 10.4-3.9.0.2-00044 Fixes: 219cc084c6706 ("ath10k: add memory dump support QCA9984") Signed-off-by: Bryan O'Donoghue Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit 4f0e6425a2da71f6c0a2e4d78f9231ca6778f71b Author: Navid Emamdoost Date: Mon Nov 25 13:52:52 2019 -0600 PCI/IOV: Fix memory leak in pci_iov_add_virtfn() commit 8c386cc817878588195dde38e919aa6ba9409d58 upstream. In the implementation of pci_iov_add_virtfn() the allocated virtfn is leaked if pci_setup_device() fails. The error handling is not calling pci_stop_and_remove_bus_device(). Change the goto label to failed2. Fixes: 156c55325d30 ("PCI: Check for pci_setup_device() failure in pci_iov_add_virtfn()") Link: https://lore.kernel.org/r/20191125195255.23740-1-navid.emamdoost@gmail.com Signed-off-by: Navid Emamdoost Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit da268240fb0a0be2112bbde938812a4958e7f3de Author: Bean Huo Date: Mon Jan 20 14:08:13 2020 +0100 scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails commit b9fc5320212efdfb4e08b825aaa007815fd11d16 upstream. A non-zero error value likely being returned by ufshcd_scsi_add_wlus() in case of failure of adding the WLs, but ufshcd_probe_hba() doesn't use this value, and doesn't report this failure to upper caller. This patch is to fix this issue. Fixes: 2a8fa600445c ("ufs: manually add well known logical units") Link: https://lore.kernel.org/r/20200120130820.1737-2-huobean@gmail.com Reviewed-by: Asutosh Das Reviewed-by: Alim Akhtar Reviewed-by: Stanley Chu Signed-off-by: Bean Huo Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 21702236f3520d28981267bd4bf5ca68fd9231db Author: Artemy Kovalyov Date: Tue Jan 28 15:56:12 2020 +0200 RDMA/umem: Fix ib_umem_find_best_pgsz() commit 36798d5ae1af62e830c5e045b2e41ce038690c61 upstream. Except for the last entry, the ending iova alignment sets the maximum possible page size as the low bits of the iova must be zero when starting the next chunk. Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR") Link: https://lore.kernel.org/r/20200128135612.174820-1-leon@kernel.org Signed-off-by: Artemy Kovalyov Signed-off-by: Leon Romanovsky Tested-by: Gal Pressman Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 56b22525ab803752197a64baddf1cb927a424937 Author: Parav Pandit Date: Sun Jan 26 16:26:46 2020 +0200 RDMA/cma: Fix unbalanced cm_id reference count during address resolve commit b4fb4cc5ba83b20dae13cef116c33648e81d2f44 upstream. Below commit missed the AF_IB and loopback code flow in rdma_resolve_addr(). This leads to an unbalanced cm_id refcount in cma_work_handler() which puts the refcount which was not incremented prior to queuing the work. A call trace is observed with such code flow: BUG: unable to handle kernel NULL pointer dereference at (null) [] __mutex_lock_slowpath+0x166/0x1d0 [] mutex_lock+0x1f/0x2f [] cma_work_handler+0x25/0xa0 [] process_one_work+0x17f/0x440 [] worker_thread+0x126/0x3c0 Hence, hold the cm_id reference when scheduling the resolve work item. Fixes: 722c7b2bfead ("RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()") Link: https://lore.kernel.org/r/20200126142652.104803-2-leon@kernel.org Signed-off-by: Parav Pandit Signed-off-by: Leon Romanovsky Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit b73401025a14b11f229526af4e799fa796284fe6 Author: Michael Guralnik Date: Wed Jan 8 20:05:35 2020 +0200 RDMA/uverbs: Verify MR access flags commit ca95c1411198c2d87217c19d44571052cdc94725 upstream. Verify that MR access flags that are passed from user are all supported ones, otherwise an error is returned. Fixes: 4fca03778351 ("IB/uverbs: Move ib_access_flags and ib_read_counters_flags to uapi") Link: https://lore.kernel.org/r/1578506740-22188-6-git-send-email-yishaih@mellanox.com Signed-off-by: Michael Guralnik Signed-off-by: Yishai Hadas Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 78923675151e24668be3b618c13647c7475fa64f Author: Jason Gunthorpe Date: Wed Jan 8 19:22:03 2020 +0200 RDMA/core: Fix locking in ib_uverbs_event_read commit 14e23bd6d22123f6f3b2747701fa6cd4c6d05873 upstream. This should not be using ib_dev to test for disassociation, during disassociation is_closed is set under lock and the waitq is triggered. Instead check is_closed and be sure to re-obtain the lock to test the value after the wait_event returns. Fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications") Link: https://lore.kernel.org/r/1578504126-9400-12-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas Reviewed-by: Håkon Bugge Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 33daaea78a1f4463f6849b9e1e9e77996c263a92 Author: Xiyu Yang Date: Mon Dec 30 10:24:28 2019 +0800 RDMA/i40iw: fix a potential NULL pointer dereference commit 04db1580b5e48a79e24aa51ecae0cd4b2296ec23 upstream. A NULL pointer can be returned by in_dev_get(). Thus add a corresponding check so that a NULL pointer dereference will be avoided at this place. Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status") Link: https://lore.kernel.org/r/1577672668-46499-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang Signed-off-by: Xin Tan Reviewed-by: Leon Romanovsky Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit b1f90d263a3b2b6ca1bc49b3542aaf13e6df7eba Author: Håkon Bugge Date: Mon Dec 16 13:04:36 2019 +0100 RDMA/netlink: Do not always generate an ACK for some netlink operations commit a242c36951ecd24bc16086940dbe6b522205c461 upstream. In rdma_nl_rcv_skb(), the local variable err is assigned the return value of the supplied callback function, which could be one of ib_nl_handle_resolve_resp(), ib_nl_handle_set_timeout(), or ib_nl_handle_ip_res_resp(). These three functions all return skb->len on success. rdma_nl_rcv_skb() is merely a copy of netlink_rcv_skb(). The callback functions used by the latter have the convention: "Returns 0 on success or a negative error code". In particular, the statement (equal for both functions): if (nlh->nlmsg_flags & NLM_F_ACK || err) implies that rdma_nl_rcv_skb() always will ack a message, independent of the NLM_F_ACK being set in nlmsg_flags or not. The fix could be to change the above statement, but it is better to keep the two *_rcv_skb() functions equal in this respect and instead change the three callback functions in the rdma subsystem to the correct convention. Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20191216120436.3204814-1-haakon.bugge@oracle.com Suggested-by: Mark Haywood Signed-off-by: Håkon Bugge Tested-by: Mark Haywood Reviewed-by: Leon Romanovsky Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 839fb9e04cd7edcbe0d8ae712a52f4d6522af5b8 Author: Håkon Bugge Date: Thu Jan 23 16:55:21 2020 +0100 IB/mlx4: Fix leak in id_map_find_del commit ea660ad7c1c476fd6e5e3b17780d47159db71dea upstream. Using CX-3 virtual functions, either from a bare-metal machine or pass-through from a VM, MAD packets are proxied through the PF driver. Since the VF drivers have separate name spaces for MAD Transaction Ids (TIDs), the PF driver has to re-map the TIDs and keep the book keeping in a cache. Following the RDMA Connection Manager (CM) protocol, it is clear when an entry has to evicted from the cache. When a DREP is sent from mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar when a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is called. This function wipes out the TID in use from the IDR or XArray and removes the id_map_entry from the table. In short, it does everything except the topping of the cake, which is to remove the entry from the list and free it. In other words, for the REJ case enumerated above, one id_map_entry will be leaked. For the other case above, a DREQ has been received first. The reception of the DREQ will trigger queuing of a delayed work to delete the id_map_entry, for the case where the VM doesn't send back a DREP. In the normal case, the VM _will_ send back a DREP, and id_map_find_del() will be called. But this scenario introduces a secondary leak. First, when the DREQ is received, a delayed work is queued. The VM will then return a DREP, which will call id_map_find_del(). As stated above, this will free the TID used from the XArray or IDR. Now, there is window where that particular TID can be re-allocated, lets say by an outgoing REQ. This TID will later be wiped out by the delayed work, when the function id_map_ent_timeout() is called. But the id_map_entry allocated by the outgoing REQ will not be de-allocated, and we have a leak. Both leaks are fixed by removing the id_map_find_del() function and only using schedule_delayed(). Of course, a check in schedule_delayed() to see if the work already has been queued, has been added. Another benefit of always using the delayed version for deleting entries, is that we do get a TimeWait effect; a TID no longer in use, will occupy the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability of being re-used for that time period. Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200123155521.1212288-1-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge Signed-off-by: Manjunath Patil Reviewed-by: Rama Nichanamatlu Reviewed-by: Jack Morgenstein Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 996dc3d50add6069079fe82d8e17a0050cb22131 Author: Sergey Gorenko Date: Wed Jan 15 13:30:55 2020 +0000 IB/srp: Never use immediate data if it is disabled by a user commit 0fbb37dd82998b5c83355997b3bdba2806968ac7 upstream. Some SRP targets that do not support specification SRP-2, put the garbage to the reserved bits of the SRP login response. The problem was not detected for a long time because the SRP initiator ignored those bits. But now one of them is used as SRP_LOGIN_RSP_IMMED_SUPP. And it causes a critical error on the target when the initiator sends immediate data. The ib_srp module has a use_imm_date parameter to enable or disable immediate data manually. But it does not help in the above case, because use_imm_date is ignored at handling the SRP login response. The problem is definitely caused by a bug on the target side, but the initiator's behavior also does not look correct. The initiator should not use immediate data if use_imm_date is disabled by a user. This commit adds an additional checking of use_imm_date at the handling of SRP login response to avoid unexpected use of immediate data. Fixes: 882981f4a411 ("RDMA/srp: Add support for immediate data") Link: https://lore.kernel.org/r/20200115133055.30232-1-sergeygo@mellanox.com Signed-off-by: Sergey Gorenko Reviewed-by: Bart Van Assche Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 56f5f41e80b32dc66eb752270f28944629f0a017 Author: Jack Morgenstein Date: Wed Jan 15 10:50:50 2020 +0200 IB/mlx4: Fix memory leak in add_gid error flow commit eaad647e5cc27f7b46a27f3b85b14c4c8a64bffa upstream. In procedure mlx4_ib_add_gid(), if the driver is unable to update the FW gid table, there is a memory leak in the driver's copy of the gid table: the gid entry's context buffer is not freed. If such an error occurs, free the entry's context buffer, and mark the entry as available (by setting its context pointer to NULL). Fixes: e26be1bfef81 ("IB/mlx4: Implement ib_device callbacks") Link: https://lore.kernel.org/r/20200115085050.73746-1-leon@kernel.org Signed-off-by: Jack Morgenstein Reviewed-by: Parav Pandit Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit b96c27b1891ba4318af11dd09318f2993e71bc8b Author: Sunil Muthuswamy Date: Fri Jan 24 03:08:18 2020 +0000 hv_sock: Remove the accept port restriction [ Upstream commit c742c59e1fbd022b64d91aa9a0092b3a699d653c ] Currently, hv_sock restricts the port the guest socket can accept connections on. hv_sock divides the socket port namespace into two parts for server side (listening socket), 0-0x7FFFFFFF & 0x80000000-0xFFFFFFFF (there are no restrictions on client port namespace). The first part (0-0x7FFFFFFF) is reserved for sockets where connections can be accepted. The second part (0x80000000-0xFFFFFFFF) is reserved for allocating ports for the peer (host) socket, once a connection is accepted. This reservation of the port namespace is specific to hv_sock and not known by the generic vsock library (ex: af_vsock). This is problematic because auto-binds/ephemeral ports are handled by the generic vsock library and it has no knowledge of this port reservation and could allocate a port that is not compatible with hv_sock (and legitimately so). The issue hasn't surfaced so far because the auto-bind code of vsock (__vsock_bind_stream) prior to the change 'VSOCK: bind to random port for VMADDR_PORT_ANY' would start walking up from LAST_RESERVED_PORT (1023) and start assigning ports. That will take a large number of iterations to hit 0x7FFFFFFF. But, after the above change to randomize port selection, the issue has started coming up more frequently. There has really been no good reason to have this port reservation logic in hv_sock from the get go. Reserving a local port for peer ports is not how things are handled generally. Peer ports should reflect the peer port. This fixes the issue by lifting the port reservation, and also returns the right peer port. Since the code converts the GUID to the peer port (by using the first 4 bytes), there is a possibility of conflicts, but that seems like a reasonable risk to take, given this is limited to vsock and that only applies to all local sockets. Signed-off-by: Sunil Muthuswamy Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f7775193b64344079f5555387548f6356640244d Author: Ranjani Sridharan Date: Mon Nov 4 14:48:11 2019 -0800 ASoC: pcm: update FE/BE trigger order based on the command [ Upstream commit acbf27746ecfa96b290b54cc7f05273482ea128a ] Currently, the trigger orders SND_SOC_DPCM_TRIGGER_PRE/POST determine the order in which FE DAI and BE DAI are triggered. In the case of SND_SOC_DPCM_TRIGGER_PRE, the FE DAI is triggered before the BE DAI and in the case of SND_SOC_DPCM_TRIGGER_POST, the BE DAI is triggered before the FE DAI. And this order remains the same irrespective of the trigger command. In the case of the SOF driver, during playback, the FW expects the BE DAI to be triggered before the FE DAI during the START trigger. The BE DAI trigger handles the starting of Link DMA and so it must be started before the FE DAI is started to prevent xruns during pause/release. This can be addressed by setting the trigger order for the FE dai link to SND_SOC_DPCM_TRIGGER_POST. But during the STOP trigger, the FW expects the FE DAI to be triggered before the BE DAI. Retaining the same order during the START and STOP commands, results in FW error as the DAI component in the FW is still active. The issue can be fixed by mirroring the trigger order of FE and BE DAI's during the START and STOP trigger. So, with the trigger order set to SND_SOC_DPCM_TRIGGER_PRE, the FE DAI will be trigger first during SNDRV_PCM_TRIGGER_START/STOP/RESUME and the BE DAI will be triggered first during the STOP/SUSPEND/PAUSE commands. Conversely, with the trigger order set to SND_SOC_DPCM_TRIGGER_POST, the BE DAI will be triggered first during the SNDRV_PCM_TRIGGER_START/STOP/RESUME commands and the FE DAI will be triggered first during the SNDRV_PCM_TRIGGER_STOP/SUSPEND/PAUSE commands. Signed-off-by: Ranjani Sridharan Signed-off-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20191104224812.3393-2-ranjani.sridharan@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit d6591ea2dd1a44b1c72c5a3e3b6555d7585acdae Author: Greg Kroah-Hartman Date: Tue Feb 11 04:35:55 2020 -0800 Linux 5.4.19 commit 866323ccc1388f68b866396dc06991b83c2ec2a3 Author: Christophe Leroy Date: Fri Jan 24 11:54:41 2020 +0000 powerpc/kuap: Fix set direction in allow/prevent_user_access() [ Upstream commit 1d8f739b07bd538f272f60bf53f10e7e6248d295 ] __builtin_constant_p() always return 0 for pointers, so on RADIX we always end up opening both direction (by writing 0 in SPR29): 0000000000000170 <._copy_to_user>: ... 1b0: 4c 00 01 2c isync 1b4: 39 20 00 00 li r9,0 1b8: 7d 3d 03 a6 mtspr 29,r9 1bc: 4c 00 01 2c isync 1c0: 48 00 00 01 bl 1c0 <._copy_to_user+0x50> 1c0: R_PPC64_REL24 .__copy_tofrom_user ... 0000000000000220 <._copy_from_user>: ... 2ac: 4c 00 01 2c isync 2b0: 39 20 00 00 li r9,0 2b4: 7d 3d 03 a6 mtspr 29,r9 2b8: 4c 00 01 2c isync 2bc: 7f c5 f3 78 mr r5,r30 2c0: 7f 83 e3 78 mr r3,r28 2c4: 48 00 00 01 bl 2c4 <._copy_from_user+0xa4> 2c4: R_PPC64_REL24 .__copy_tofrom_user ... Use an explicit parameter for direction selection, so that GCC is able to see it is a constant: 00000000000001b0 <._copy_to_user>: ... 1f0: 4c 00 01 2c isync 1f4: 3d 20 40 00 lis r9,16384 1f8: 79 29 07 c6 rldicr r9,r9,32,31 1fc: 7d 3d 03 a6 mtspr 29,r9 200: 4c 00 01 2c isync 204: 48 00 00 01 bl 204 <._copy_to_user+0x54> 204: R_PPC64_REL24 .__copy_tofrom_user ... 0000000000000260 <._copy_from_user>: ... 2ec: 4c 00 01 2c isync 2f0: 39 20 ff ff li r9,-1 2f4: 79 29 00 04 rldicr r9,r9,0,0 2f8: 7d 3d 03 a6 mtspr 29,r9 2fc: 4c 00 01 2c isync 300: 7f c5 f3 78 mr r5,r30 304: 7f 83 e3 78 mr r3,r28 308: 48 00 00 01 bl 308 <._copy_from_user+0xa8> 308: R_PPC64_REL24 .__copy_tofrom_user ... Signed-off-by: Christophe Leroy [mpe: Spell out the directions, s/KUAP_R/KUAP_READ/ etc.] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/f4e88ec4941d5facb35ce75026b0112f980086c3.1579866752.git.christophe.leroy@c-s.fr Signed-off-by: Sasha Levin commit 3556d66be3f1d181d4b26d112e7953159a69752c Author: Stephen Rothwell Date: Wed Jan 15 12:02:58 2020 +1100 regulator fix for "regulator: core: Add regulator_is_equal() helper" [ Upstream commit 0468e667a5bead9c1b7ded92861b5a98d8d78745 ] Signed-off-by: Stephen Rothwell Link: https://lore.kernel.org/r/20200115120258.0e535fcb@canb.auug.org.au Acked-by: Marek Vasut Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 0f51165c22e425af2cb243b8a8e891f1ae9e34a8 Author: David Howells Date: Thu Feb 6 13:55:01 2020 +0000 rxrpc: Fix service call disconnection [ Upstream commit b39a934ec72fa2b5a74123891f25273a38378b90 ] The recent patch that substituted a flag on an rxrpc_call for the connection pointer being NULL as an indication that a call was disconnected puts the set_bit in the wrong place for service calls. This is only a problem if a call is implicitly terminated by a new call coming in on the same connection channel instead of a terminating ACK packet. In such a case, rxrpc_input_implicit_end_call() calls __rxrpc_disconnect_call(), which is now (incorrectly) setting the disconnection bit, meaning that when rxrpc_release_call() is later called, it doesn't call rxrpc_disconnect_call() and so the call isn't removed from the peer's error distribution list and the list gets corrupted. KASAN finds the issue as an access after release on a call, but the position at which it occurs is confusing as it appears to be related to a different call (the call site is where the latter call is being removed from the error distribution list and either the next or pprev pointer points to a previously released call). Fix this by moving the setting of the flag from __rxrpc_disconnect_call() to rxrpc_disconnect_call() in the same place that the connection pointer was being cleared. Fixes: 5273a191dca6 ("rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect") Signed-off-by: David Howells Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 743823969080364a2ee04c20fa1d0411028554f5 Author: Song Liu Date: Thu Jan 23 10:11:46 2020 -0800 perf/core: Fix mlock accounting in perf_mmap() commit 003461559ef7a9bd0239bae35a22ad8924d6e9ad upstream. Decreasing sysctl_perf_event_mlock between two consecutive perf_mmap()s of a perf ring buffer may lead to an integer underflow in locked memory accounting. This may lead to the undesired behaviors, such as failures in BPF map creation. Address this by adjusting the accounting logic to take into account the possibility that the amount of already locked memory may exceed the current limit. Fixes: c4b75479741c ("perf/core: Make the mlock accounting simple again") Suggested-by: Alexander Shishkin Signed-off-by: Song Liu Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Cc: Acked-by: Alexander Shishkin Link: https://lkml.kernel.org/r/20200123181146.2238074-1-songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman commit d1318034e9e90180ed233b609b249c316205d056 Author: Konstantin Khlebnikov Date: Fri Jan 31 19:08:59 2020 +0300 clocksource: Prevent double add_timer_on() for watchdog_timer commit febac332a819f0e764aa4da62757ba21d18c182b upstream. Kernel crashes inside QEMU/KVM are observed: kernel BUG at kernel/time/timer.c:1154! BUG_ON(timer_pending(timer) || !timer->function) in add_timer_on(). At the same time another cpu got: general protection fault: 0000 [#1] SMP PTI of poinson pointer 0xdead000000000200 in: __hlist_del at include/linux/list.h:681 (inlined by) detach_timer at kernel/time/timer.c:818 (inlined by) expire_timers at kernel/time/timer.c:1355 (inlined by) __run_timers at kernel/time/timer.c:1686 (inlined by) run_timer_softirq at kernel/time/timer.c:1699 Unfortunately kernel logs are badly scrambled, stacktraces are lost. Printing the timer->function before the BUG_ON() pointed to clocksource_watchdog(). The execution of clocksource_watchdog() can race with a sequence of clocksource_stop_watchdog() .. clocksource_start_watchdog(): expire_timers() detach_timer(timer, true); timer->entry.pprev = NULL; raw_spin_unlock_irq(&base->lock); call_timer_fn clocksource_watchdog() clocksource_watchdog_kthread() or clocksource_unbind() spin_lock_irqsave(&watchdog_lock, flags); clocksource_stop_watchdog(); del_timer(&watchdog_timer); watchdog_running = 0; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock_irqsave(&watchdog_lock, flags); clocksource_start_watchdog(); add_timer_on(&watchdog_timer, ...); watchdog_running = 1; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock(&watchdog_lock); add_timer_on(&watchdog_timer, ...); BUG_ON(timer_pending(timer) || !timer->function); timer_pending() -> true BUG() I.e. inside clocksource_watchdog() watchdog_timer could be already armed. Check timer_pending() before calling add_timer_on(). This is sufficient as all operations are synchronized by watchdog_lock. Fixes: 75c5158f70c0 ("timekeeping: Update clocksource with stop_machine") Signed-off-by: Konstantin Khlebnikov Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/158048693917.4378.13823603769948933793.stgit@buzz Signed-off-by: Greg Kroah-Hartman commit d15b033e960389b6bb626f74b83a8b057b4b4fe3 Author: Thomas Gleixner Date: Fri Jan 31 15:26:52 2020 +0100 x86/apic/msi: Plug non-maskable MSI affinity race commit 6f1a4891a5928a5969c87fa5a584844c983ec823 upstream. Evan tracked down a subtle race between the update of the MSI message and the device raising an interrupt internally on PCI devices which do not support MSI masking. The update of the MSI message is non-atomic and consists of either 2 or 3 sequential 32bit wide writes to the PCI config space. - Write address low 32bits - Write address high 32bits (If supported by device) - Write data When an interrupt is migrated then both address and data might change, so the kernel attempts to mask the MSI interrupt first. But for MSI masking is optional, so there exist devices which do not provide it. That means that if the device raises an interrupt internally between the writes then a MSI message is sent built from half updated state. On x86 this can lead to spurious interrupts on the wrong interrupt vector when the affinity setting changes both address and data. As a consequence the device interrupt can be lost causing the device to become stuck or malfunctioning. Evan tried to handle that by disabling MSI accross an MSI message update. That's not feasible because disabling MSI has issues on its own: If MSI is disabled the PCI device is routing an interrupt to the legacy INTx mechanism. The INTx delivery can be disabled, but the disablement is not working on all devices. Some devices lose interrupts when both MSI and INTx delivery are disabled. Another way to solve this would be to enforce the allocation of the same vector on all CPUs in the system for this kind of screwed devices. That could be done, but it would bring back the vector space exhaustion problems which got solved a few years ago. Fortunately the high address (if supported by the device) is only relevant when X2APIC is enabled which implies interrupt remapping. In the interrupt remapping case the affinity setting is happening at the interrupt remapping unit and the PCI MSI message is programmed only once when the PCI device is initialized. That makes it possible to solve it with a two step update: 1) Target the MSI msg to the new vector on the current target CPU 2) Target the MSI msg to the new vector on the new target CPU In both cases writing the MSI message is only changing a single 32bit word which prevents the issue of inconsistency. After writing the final destination it is necessary to check whether the device issued an interrupt while the intermediate state #1 (new vector, current CPU) was in effect. This is possible because the affinity change is always happening on the current target CPU. The code runs with interrupts disabled, so the interrupt can be detected by checking the IRR of the local APIC. If the vector is pending in the IRR then the interrupt is retriggered on the new target CPU by sending an IPI for the associated vector on the target CPU. This can cause spurious interrupts on both the local and the new target CPU. 1) If the new vector is not in use on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then interrupt entry code will ignore that spurious interrupt. The vector is marked so that the 'No irq handler for vector' warning is supressed once. 2) If the new vector is in use already on the local CPU then the IRR check might see an pending interrupt from the device which is using this vector. The IPI to the new target CPU will then invoke the handler of the device, which got the affinity change, even if that device did not issue an interrupt 3) If the new vector is in use already on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then the handler of the device which uses that vector on the local CPU will be invoked. expose issues in device driver interrupt handlers which are not prepared to handle a spurious interrupt correctly. This not a regression, it's just exposing something which was already broken as spurious interrupts can happen for a lot of reasons and all driver handlers need to be able to deal with them. Reported-by: Evan Green Debugged-by: Evan Green Signed-off-by: Thomas Gleixner Tested-by: Evan Green Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman commit b64d7f7af8f956befd43ea0afa870edcffe7f4b6 Author: Ronnie Sahlberg Date: Wed Feb 5 11:08:01 2020 +1000 cifs: fail i/o on soft mounts if sessionsetup errors out commit b0dd940e582b6a60296b9847a54012a4b080dc72 upstream. RHBZ: 1579050 If we have a soft mount we should fail commands for session-setup failures (such as the password having changed/ account being deleted/ ...) and return an error back to the application. Signed-off-by: Ronnie Sahlberg Signed-off-by: Steve French CC: Stable Signed-off-by: Greg Kroah-Hartman commit 3e3e8551a5a2537c002ac9d9fb0d68c93de303d5 Author: Sean Christopherson Date: Wed Jan 8 12:24:38 2020 -0800 KVM: Play nice with read-only memslots when querying host page size [ Upstream commit 42cde48b2d39772dba47e680781a32a6c4b7dc33 ] Avoid the "writable" check in __gfn_to_hva_many(), which will always fail on read-only memslots due to gfn_to_hva() assuming writes. Functionally, this allows x86 to create large mappings for read-only memslots that are backed by HugeTLB mappings. Note, the changelog for commit 05da45583de9 ("KVM: MMU: large page support") states "If the largepage contains write-protected pages, a large pte is not used.", but "write-protected" refers to pages that are temporarily read-only, e.g. read-only memslots didn't even exist at the time. Fixes: 4d8b81abc47b ("KVM: introduce readonly memslot") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson [Redone using kvm_vcpu_gfn_to_memslot_prot. - Paolo] Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 7426ddf01f1639c19a79d93c5a354b463a563d29 Author: Sean Christopherson Date: Wed Jan 8 12:24:37 2020 -0800 KVM: Use vcpu-specific gva->hva translation when querying host page size [ Upstream commit f9b84e19221efc5f493156ee0329df3142085f28 ] Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the correct set of memslots is used when handling x86 page faults in SMM. Fixes: 54bf36aac520 ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 09bd0033df05c282ac5a5d74f5448128157b70aa Author: Miaohe Lin Date: Sat Dec 28 14:25:24 2019 +0800 KVM: nVMX: vmread should not set rflags to specify success in case of #PF [ Upstream commit a4d956b9390418623ae5d07933e2679c68b6f83c ] In case writing to vmread destination operand result in a #PF, vmread should not call nested_vmx_succeed() to set rflags to specify success. Similar to as done in VMPTRST (See handle_vmptrst()). Reviewed-by: Liran Alon Signed-off-by: Miaohe Lin Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 1d6cfa003c210f48ec9657296229927caec00c4c Author: Paolo Bonzini Date: Sat Jan 18 20:09:03 2020 +0100 KVM: x86: fix overlap between SPTE_MMIO_MASK and generation [ Upstream commit 56871d444bc4d7ea66708775e62e2e0926384dbc ] The SPTE_MMIO_MASK overlaps with the bits used to track MMIO generation number. A high enough generation number would overwrite the SPTE_SPECIAL_MASK region and cause the MMIO SPTE to be misinterpreted. Likewise, setting bits 52 and 53 would also cause an incorrect generation number to be read from the PTE, though this was partially mitigated by the (useless if it weren't for the bug) removal of SPTE_SPECIAL_MASK from the spte in get_mmio_spte_generation. Drop that removal, and replace it with a compile-time assertion. Fixes: 6eeb4ef049e7 ("KVM: x86: assign two bits to track SPTE kinds") Reported-by: Ben Gardon Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 8a1cd01bee30bd1033a452035f66be127728d4fd Author: Sean Christopherson Date: Fri Dec 6 15:57:14 2019 -0800 KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM [ Upstream commit 736c291c9f36b07f8889c61764c28edce20e715d ] Convert a plethora of parameters and variables in the MMU and page fault flows from type gva_t to gpa_t to properly handle TDP on 32-bit KVM. Thanks to PSE and PAE paging, 32-bit kernels can access 64-bit physical addresses. When TDP is enabled, the fault address is a guest physical address and thus can be a 64-bit value, even when both KVM and its guest are using 32-bit virtual addressing, e.g. VMX's VMCS.GUEST_PHYSICAL is a 64-bit field, not a natural width field. Using a gva_t for the fault address means KVM will incorrectly drop the upper 32-bits of the GPA. Ditto for gva_to_gpa() when it is used to translate L2 GPAs to L1 GPAs. Opportunistically rename variables and parameters to better reflect the dual address modes, e.g. use "cr2_or_gpa" for fault addresses and plain "addr" instead of "vaddr" when the address may be either a GVA or an L2 GPA. Similarly, use "gpa" in the nonpaging_page_fault() flows to avoid a confusing "gpa_t gva" declaration; this also sets the stage for a future patch to combing nonpaging_page_fault() and tdp_page_fault() with minimal churn. Sprinkle in a few comments to document flows where an address is known to be a GVA and thus can be safely truncated to a 32-bit value. Add WARNs in kvm_handle_page_fault() and FNAME(gva_to_gpa_nested)() to help document such cases and detect bugs. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit fc46f8a115e57ca0e4fba171aa79731f40d5217a Author: Paolo Bonzini Date: Wed Dec 4 15:50:27 2019 +0100 KVM: x86: use CPUID to locate host page table reserved bits [ Upstream commit 7adacf5eb2d2048045d9fd8fdab861fd9e7e2e96 ] The comment in kvm_get_shadow_phys_bits refers to MKTME, but the same is actually true of SME and SEV. Just use CPUID[0x8000_0008].EAX[7:0] unconditionally if available, it is simplest and works even if memory is not encrypted. Cc: stable@vger.kernel.org Reported-by: Tom Lendacky Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit f805ec37828b96a9ff48ef5db0b2adff573f977a Author: Sean Christopherson Date: Tue Jan 7 16:12:10 2020 -0800 KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM [ Upstream commit e30a7d623dccdb3f880fbcad980b0cb589a1da45 ] Remove the bogus 64-bit only condition from the check that disables MMIO spte optimization when the system supports the max PA, i.e. doesn't have any reserved PA bits. 32-bit KVM always uses PAE paging for the shadow MMU, and per Intel's SDM: PAE paging translates 32-bit linear addresses to 52-bit physical addresses. The kernel's restrictions on max physical addresses are limits on how much memory the kernel can reasonably use, not what physical addresses are supported by hardware. Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 59593aed7e9e95fd29564d10e477c2edc6e7e5bf Author: Wayne Lin Date: Thu Dec 5 17:00:43 2019 +0800 drm/dp_mst: Remove VCPI while disabling topology mgr [ Upstream commit 64e62bdf04ab8529f45ed0a85122c703035dec3a ] [Why] This patch is trying to address the issue observed when hotplug DP daisy chain monitors. e.g. src-mstb-mstb-sst -> src (unplug) mstb-mstb-sst -> src-mstb-mstb-sst (plug in again) Once unplug a DP MST capable device, driver will call drm_dp_mst_topology_mgr_set_mst() to disable MST. In this function, it cleans data of topology manager while disabling mst_state. However, it doesn't clean up the proposed_vcpis of topology manager. If proposed_vcpi is not reset, once plug in MST daisy chain monitors later, code will fail at checking port validation while trying to allocate payloads. When MST capable device is plugged in again and try to allocate payloads by calling drm_dp_update_payload_part1(), this function will iterate over all proposed virtual channels to see if any proposed VCPI's num_slots is greater than 0. If any proposed VCPI's num_slots is greater than 0 and the port which the specific virtual channel directed to is not in the topology, code then fails at the port validation. Since there are stale VCPI allocations from the previous topology enablement in proposed_vcpi[], code will fail at port validation and reurn EINVAL. [How] Clean up the data of stale proposed_vcpi[] and reset mgr->proposed_vcpis to NULL while disabling mst in drm_dp_mst_topology_mgr_set_mst(). Changes since v1: *Add on more details in commit message to describe the issue which the patch is trying to fix Signed-off-by: Wayne Lin [added cc to stable] Signed-off-by: Lyude Paul Link: https://patchwork.freedesktop.org/patch/msgid/20191205090043.7580-1-Wayne.Lin@amd.com Cc: # v3.17+ Signed-off-by: Sasha Levin commit 49874262571658c6cd5b36282ab7becf0a09bc07 Author: Josef Bacik Date: Tue Jan 21 09:17:06 2020 -0500 btrfs: free block groups after free'ing fs trees [ Upstream commit 4e19443da1941050b346f8fc4c368aa68413bc88 ] Sometimes when running generic/475 we would trip the WARN_ON(cache->reserved) check when free'ing the block groups on umount. This is because sometimes we don't commit the transaction because of IO errors and thus do not cleanup the tree logs until at umount time. These blocks are still reserved until they are cleaned up, but they aren't cleaned up until _after_ we do the free block groups work. Fix this by moving the free after free'ing the fs roots, that way all of the tree logs are cleaned up and we have a properly cleaned fs. A bunch of loops of generic/475 confirmed this fixes the problem. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 26ca39ac5593efad602391a5659772fdc369ba9f Author: Anand Jain Date: Thu Oct 10 10:39:25 2019 +0800 btrfs: use bool argument in free_root_pointers() [ Upstream commit 4273eaff9b8d5e141113a5bdf9628c02acf3afe5 ] We don't need int argument bool shall do in free_root_pointers(). And rename the argument as it confused two people. Reviewed-by: Qu Wenruo Signed-off-by: Anand Jain Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit d0d327fe37cceb44db0ba9623018ddab9be22059 Author: Thomas Gleixner Date: Thu Jan 23 12:54:53 2020 +0100 x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode commit 979923871f69a4dc926658f9f9a1a4c1bde57552 upstream. Tony reported a boot regression caused by the recent workaround for systems which have a disabled (clock gate off) PIT. On his machine the kernel fails to initialize the PIT because apic_needs_pit() does not take into account whether the local APIC interrupt delivery mode will actually allow to setup and use the local APIC timer. This should be easy to reproduce with acpi=off on the command line which also disables HPET. Due to the way the PIT/HPET and APIC setup ordering works (APIC setup can require working PIT/HPET) the information is not available at the point where apic_needs_pit() makes this decision. To address this, split out the interrupt mode selection from apic_intr_mode_init(), invoke the selection before making the decision whether PIT is required or not, and add the missing checks into apic_needs_pit(). Fixes: c8c4076723da ("x86/timer: Skip PIT initialization on modern chipsets") Reported-by: Anthony Buckley Tested-by: Anthony Buckley Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Cc: Daniel Drake Link: https://bugzilla.kernel.org/show_bug.cgi?id=206125 Link: https://lore.kernel.org/r/87sgk6tmk2.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman commit 8fbabd15bdbcef5fa12a9eeb72cab381f2010b6f Author: Matti Vaittinen Date: Mon Jan 20 15:45:11 2020 +0200 mfd: bd70528: Fix hour register mask commit 6c883472e1c11cb05561b6dd0c28bb037c2bf2de upstream. When RTC is used in 24H mode (and it is by this driver) the maximum hour value is 24 in BCD. This occupies bits [5:0] - which means correct mask for HOUR register is 0x3f not 0x1f. Fix the mask Fixes: 32a4a4ebf768 ("rtc: bd70528: Initial support for ROHM bd70528 RTC") Signed-off-by: Matti Vaittinen Acked-by: Alexandre Belloni Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 555b3025e8fa7d7f28d2c550d8ac6bb1b8bd478b Author: Andreas Kemnade Date: Fri Jan 17 22:59:22 2020 +0100 mfd: rn5t618: Mark ADC control register volatile commit 2f3dc25c0118de03a00ddc88b61f7216854f534d upstream. There is a bit which gets cleared after conversion. Fixes: 9bb9e29c78f8 ("mfd: Add Ricoh RN5T618 PMIC core driver") Signed-off-by: Andreas Kemnade Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 3cf5733a2db7701c5231aaa73a702ada0c80c5e9 Author: Marco Felsch Date: Wed Jan 8 10:57:02 2020 +0100 mfd: da9062: Fix watchdog compatible string commit 1112ba02ff1190ca9c15a912f9269e54b46d2d82 upstream. The watchdog driver compatible is "dlg,da9062-watchdog" and not "dlg,da9062-wdt". Therefore the mfd-core can't populate the of_node and fwnode. As result the watchdog driver can't parse the devicetree. Fixes: 9b40b030c4ad ("mfd: da9062: Supply core driver") Signed-off-by: Marco Felsch Acked-by: Guenter Roeck Reviewed-by: Adam Thomson Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 9af68afd832fe664eeca856942d0c45aa71db14e Author: Cezary Rojewski Date: Wed Jan 22 19:12:54 2020 +0100 ASoC: Intel: skl_hda_dsp_common: Fix global-out-of-bounds bug commit 15adb20f64c302b31e10ad50f22bb224052ce1df upstream. Definitions for idisp snd_soc_dai_links within skl_hda_dsp_common are missing platform component. Add it to address following bug reported by KASAN: [ 10.538502] BUG: KASAN: global-out-of-bounds in skl_hda_audio_probe+0x13a/0x2b0 [snd_soc_skl_hda_dsp] [ 10.538509] Write of size 8 at addr ffffffffc0606840 by task systemd-udevd/299 (...) [ 10.538519] Call Trace: [ 10.538524] dump_stack+0x62/0x95 [ 10.538528] print_address_description+0x2f5/0x3b0 [ 10.538532] ? skl_hda_audio_probe+0x13a/0x2b0 [snd_soc_skl_hda_dsp] [ 10.538535] __kasan_report+0x134/0x191 [ 10.538538] ? skl_hda_audio_probe+0x13a/0x2b0 [snd_soc_skl_hda_dsp] [ 10.538542] ? skl_hda_audio_probe+0x13a/0x2b0 [snd_soc_skl_hda_dsp] [ 10.538544] kasan_report+0x12/0x20 [ 10.538546] __asan_store8+0x57/0x90 [ 10.538550] skl_hda_audio_probe+0x13a/0x2b0 [snd_soc_skl_hda_dsp] [ 10.538553] platform_drv_probe+0x51/0xb0 [ 10.538556] really_probe+0x311/0x600 [ 10.538559] driver_probe_device+0x87/0x1b0 [ 10.538562] device_driver_attach+0x8f/0xa0 [ 10.538565] ? device_driver_attach+0xa0/0xa0 [ 10.538567] __driver_attach+0x102/0x1a0 [ 10.538569] ? device_driver_attach+0xa0/0xa0 [ 10.538572] bus_for_each_dev+0xe8/0x160 [ 10.538574] ? subsys_dev_iter_exit+0x10/0x10 [ 10.538577] ? preempt_count_sub+0x18/0xc0 [ 10.538580] ? _raw_write_unlock+0x1f/0x40 [ 10.538582] driver_attach+0x2b/0x30 [ 10.538585] bus_add_driver+0x251/0x340 [ 10.538588] driver_register+0xd3/0x1c0 [ 10.538590] __platform_driver_register+0x6c/0x80 [ 10.538592] ? 0xffffffffc03e8000 [ 10.538595] skl_hda_audio_init+0x1c/0x1000 [snd_soc_skl_hda_dsp] [ 10.538598] do_one_initcall+0xd0/0x36a [ 10.538600] ? trace_event_raw_event_initcall_finish+0x160/0x160 [ 10.538602] ? kasan_unpoison_shadow+0x36/0x50 [ 10.538605] ? __kasan_kmalloc+0xcc/0xe0 [ 10.538607] ? kasan_unpoison_shadow+0x36/0x50 [ 10.538609] ? kasan_poison_shadow+0x2f/0x40 [ 10.538612] ? __asan_register_globals+0x65/0x80 [ 10.538615] do_init_module+0xf9/0x36f [ 10.538619] load_module+0x398e/0x4590 [ 10.538625] ? module_frob_arch_sections+0x20/0x20 [ 10.538628] ? __kasan_check_write+0x14/0x20 [ 10.538630] ? kernel_read+0x9a/0xc0 [ 10.538632] ? __kasan_check_write+0x14/0x20 [ 10.538634] ? kernel_read_file+0x1d3/0x3c0 [ 10.538638] ? cap_capable+0xca/0x110 [ 10.538642] __do_sys_finit_module+0x190/0x1d0 [ 10.538644] ? __do_sys_finit_module+0x190/0x1d0 [ 10.538646] ? __x64_sys_init_module+0x50/0x50 [ 10.538649] ? expand_files+0x380/0x380 [ 10.538652] ? __kasan_check_write+0x14/0x20 [ 10.538654] ? fput_many+0x20/0xc0 [ 10.538658] __x64_sys_finit_module+0x43/0x50 [ 10.538660] do_syscall_64+0xce/0x700 [ 10.538662] ? syscall_return_slowpath+0x230/0x230 [ 10.538665] ? __do_page_fault+0x51e/0x640 [ 10.538668] ? __kasan_check_read+0x11/0x20 [ 10.538670] ? prepare_exit_to_usermode+0xc7/0x200 [ 10.538673] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a78959f407e6 ("ASoC: Intel: skl_hda_dsp_common: use modern dai_link style") Signed-off-by: Cezary Rojewski Reviewed-by: Kai Vehmanen Link: https://lore.kernel.org/r/20200122181254.22801-1-cezary.rojewski@intel.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 207014751cd1890169bedfcebdabddd08951b92c Author: Tariq Toukan Date: Mon Jan 27 14:18:14 2020 +0200 net/mlx5: Deprecate usage of generic TLS HW capability bit [ Upstream commit 61c00cca41aeeaa8e5263c2f81f28534bc1efafb ] Deprecate the generic TLS cap bit, use the new TX-specific TLS cap bit instead. Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures") Signed-off-by: Tariq Toukan Reviewed-by: Eran Ben Elisha Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman commit 70b68add8d65c2745d46cd71693536f4a1b8404a Author: Maor Gottlieb Date: Mon Jan 27 09:27:51 2020 +0200 net/mlx5: Fix deadlock in fs_core [ Upstream commit c1948390d78b5183ee9b7dd831efd7f6ac496ab0 ] free_match_list could be called when the flow table is already locked. We need to pass this notation to tree_put_node. It fixes the following lockdep warnning: [ 1797.268537] ============================================ [ 1797.276837] WARNING: possible recursive locking detected [ 1797.285101] 5.5.0-rc5+ #10 Not tainted [ 1797.291641] -------------------------------------------- [ 1797.299917] handler10/9296 is trying to acquire lock: [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at: tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.319694] [ 1797.319694] but task is already holding lock: [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.344707] [ 1797.344707] other info that might help us debug this: [ 1797.356952] Possible unsafe locking scenario: [ 1797.356952] [ 1797.368333] CPU0 [ 1797.373357] ---- [ 1797.378364] lock(&node->lock); [ 1797.384222] lock(&node->lock); [ 1797.390031] [ 1797.390031] *** DEADLOCK *** [ 1797.390031] [ 1797.403003] May be due to missing lock nesting notation [ 1797.403003] [ 1797.414691] 3 locks held by handler10/9296: [ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at: tc_setup_cb_add+0x70/0x250 [ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at: mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core] [ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.459913] [ 1797.459913] stack backtrace: [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not tainted 5.5.0-rc5+ #10 [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 [ 1797.491480] Call Trace: [ 1797.496701] dump_stack+0x96/0xe0 [ 1797.502864] __lock_acquire.cold.63+0xf8/0x212 [ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250 [ 1797.517701] ? mark_held_locks+0x55/0xa0 [ 1797.524547] ? quarantine_put+0xb7/0x160 [ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250 [ 1797.538913] lock_acquire+0xd6/0x1f0 [ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.553701] down_write+0x94/0x140 [ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.568464] ? down_write_killable_nested+0x170/0x170 [ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core] [ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core] [ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core] [ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core] [ 1797.621838] ? lock_acquire+0xd6/0x1f0 [ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core] [ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core] [ 1797.647459] ? try_to_wake_up+0x4c7/0xc70 [ 1797.654881] ? lock_downgrade+0x350/0x350 [ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0 [ 1797.670396] ? find_held_lock+0xac/0xd0 [ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core] [ 1797.704270] ? irq_exit+0xa5/0x170 [ 1797.710764] ? retint_kernel+0x10/0x10 [ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230 [mlx5_core] [ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core] [ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core] [ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core] [ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core] [ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core] [ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core] [ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core] [ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core] [ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0 [mlx5_core] [ 1797.815093] ? wait_for_completion+0x260/0x260 [ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core] [ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core] [ 1797.842096] ? down_read+0x11a/0x2e0 [ 1797.849090] ? down_write+0x140/0x140 [ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core] [ 1797.866027] tc_setup_cb_add+0x11a/0x250 [ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower] [ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower] [ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower] [ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.910728] ? lock_downgrade+0x350/0x350 [ 1797.918187] ? __radix_tree_lookup+0xa5/0x130 [ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower] [ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.945673] tc_new_tfilter+0xcd1/0x1240 [ 1797.953138] ? tc_del_tfilter+0xb10/0xb10 [ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320 [ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320 [ 1797.976816] ? avc_has_extended_perms+0x990/0x990 [ 1797.985090] ? mark_lock+0xaa/0x9e0 [ 1797.991988] ? match_held_lock+0x1b/0x240 [ 1797.999457] ? match_held_lock+0x1b/0x240 [ 1798.006859] ? find_held_lock+0xac/0xd0 [ 1798.014045] ? symbol_put_addr+0x40/0x40 [ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.029460] ? tc_del_tfilter+0xb10/0xb10 [ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620 [ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250 [ 1798.059837] ? match_held_lock+0x1b/0x240 [ 1798.067146] ? find_held_lock+0xac/0xd0 [ 1798.074246] netlink_rcv_skb+0xc6/0x1f0 [ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.089104] ? netlink_ack+0x440/0x440 [ 1798.096061] netlink_unicast+0x2d4/0x3b0 [ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0 [ 1798.110724] ? _copy_from_iter_full+0xda/0x370 [ 1798.118415] netlink_sendmsg+0x3ba/0x6a0 [ 1798.125478] ? netlink_unicast+0x3b0/0x3b0 [ 1798.132705] ? netlink_unicast+0x3b0/0x3b0 [ 1798.139880] sock_sendmsg+0x94/0xa0 [ 1798.146332] ____sys_sendmsg+0x36c/0x3f0 [ 1798.153251] ? copy_msghdr_from_user+0x165/0x230 [ 1798.160941] ? kernel_sendmsg+0x30/0x30 [ 1798.167738] ___sys_sendmsg+0xeb/0x150 [ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30 [ 1798.181649] ? lock_downgrade+0x350/0x350 [ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.196239] ? __fget+0x21d/0x320 [ 1798.202335] ? do_dup2+0x2a0/0x2a0 [ 1798.208499] ? lock_downgrade+0x350/0x350 [ 1798.215366] ? __fget_light+0xd6/0xf0 [ 1798.221808] ? syscall_trace_enter+0x369/0x5d0 [ 1798.229112] __sys_sendmsg+0xd3/0x160 [ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60 [ 1798.242478] ? syscall_trace_enter+0x233/0x5d0 [ 1798.249721] ? syscall_slow_exit_work+0x280/0x280 [ 1798.257211] ? do_syscall_64+0x1e/0x2e0 [ 1798.263680] do_syscall_64+0x72/0x2e0 [ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Maor Gottlieb Signed-off-by: Alaa Hleihel Reviewed-by: Mark Bloch Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman commit 0fea83e06f12fdaefe796f7e257bf2bf80690b35 Author: Ido Schimmel Date: Fri Feb 7 19:29:28 2020 +0200 drop_monitor: Do not cancel uninitialized work item [ Upstream commit dfa7f709596be5ca46c070d4f8acbb344322056a ] Drop monitor uses a work item that takes care of constructing and sending netlink notifications to user space. In case drop monitor never started to monitor, then the work item is uninitialized and not associated with a function. Therefore, a stop command from user space results in canceling an uninitialized work item which leads to the following warning [1]. Fix this by not processing a stop command if drop monitor is not currently monitoring. [1] [ 31.735402] ------------[ cut here ]------------ [ 31.736470] WARNING: CPU: 0 PID: 143 at kernel/workqueue.c:3032 __flush_work+0x89f/0x9f0 ... [ 31.738120] CPU: 0 PID: 143 Comm: dwdump Not tainted 5.5.0-custom-09491-g16d4077796b8 #727 [ 31.741968] RIP: 0010:__flush_work+0x89f/0x9f0 ... [ 31.760526] Call Trace: [ 31.771689] __cancel_work_timer+0x2a6/0x3b0 [ 31.776809] net_dm_cmd_trace+0x300/0xef0 [ 31.777549] genl_rcv_msg+0x5c6/0xd50 [ 31.781005] netlink_rcv_skb+0x13b/0x3a0 [ 31.784114] genl_rcv+0x29/0x40 [ 31.784720] netlink_unicast+0x49f/0x6a0 [ 31.787148] netlink_sendmsg+0x7cf/0xc80 [ 31.790426] ____sys_sendmsg+0x620/0x770 [ 31.793458] ___sys_sendmsg+0xfd/0x170 [ 31.802216] __sys_sendmsg+0xdf/0x1a0 [ 31.806195] do_syscall_64+0xa0/0x540 [ 31.806885] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 8e94c3bc922e ("drop_monitor: Allow user to start monitoring hardware drops") Signed-off-by: Ido Schimmel Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2b2de489c82381a27e9a1f4fe3d586e000d2bacc Author: Sudarsana Reddy Kalluru Date: Wed Feb 5 05:10:55 2020 -0800 qed: Fix timestamping issue for L2 unicast ptp packets. [ Upstream commit 0202d293c2faecba791ba4afc5aec086249c393d ] commit cedeac9df4b8 ("qed: Add support for Timestamping the unicast PTP packets.") handles the timestamping of L4 ptp packets only. This patch adds driver changes to detect/timestamp both L2/L4 unicast PTP packets. Fixes: cedeac9df4b8 ("qed: Add support for Timestamping the unicast PTP packets.") Signed-off-by: Sudarsana Reddy Kalluru Signed-off-by: Ariel Elior Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 036ecba2eb995c952dca3700393b9381b4f27937 Author: Eric Dumazet Date: Fri Feb 7 07:16:37 2020 -0800 ipv6/addrconf: fix potential NULL deref in inet6_set_link_af() [ Upstream commit db3fa271022dacb9f741b96ea4714461a8911bb9 ] __in6_dev_get(dev) called from inet6_set_link_af() can return NULL. The needed check has been recently removed, let's add it back. While do_setlink() does call validate_linkmsg() : ... err = validate_linkmsg(dev, tb); /* OK at this point */ ... It is possible that the following call happening before the ->set_link_af() removes IPv6 if MTU is less than 1280 : if (tb[IFLA_MTU]) { err = dev_set_mtu_ext(dev, nla_get_u32(tb[IFLA_MTU]), extack); if (err < 0) goto errout; status |= DO_SETLINK_MODIFIED; } ... if (tb[IFLA_AF_SPEC]) { ... err = af_ops->set_link_af(dev, af); ->inet6_set_link_af() // CRASH because idev is NULL Please note that IPv4 is immune to the bug since inet_set_link_af() does : struct in_device *in_dev = __in_dev_get_rcu(dev); if (!in_dev) return -EAFNOSUPPORT; This problem has been mentioned in commit cf7afbfeb8ce ("rtnl: make link af-specific updates atomic") changelog : This method is not fail proof, while it is currently sufficient to make set_link_af() inerrable and thus 100% atomic, the validation function method will not be able to detect all error scenarios in the future, there will likely always be errors depending on states which are f.e. not protected by rtnl_mutex and thus may change between validation and setting. IPv6: ADDRCONF(NETDEV_CHANGE): lo: link becomes ready general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7] CPU: 0 PID: 9698 Comm: syz-executor712 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733 Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00 RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6 RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0 RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0 R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000 R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000 FS: 0000000000c46880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f0494ca0d0 CR3: 000000009e4ac000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: do_setlink+0x2a9f/0x3720 net/core/rtnetlink.c:2754 rtnl_group_changelink net/core/rtnetlink.c:3103 [inline] __rtnl_newlink+0xdd1/0x1790 net/core/rtnetlink.c:3257 rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377 rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4402e9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffd62fbcf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004402e9 RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8 R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000401b70 R13: 0000000000401c00 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace cfa7664b8fdcdff3 ]--- RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733 Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00 RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6 RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0 RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0 R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000 R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000 FS: 0000000000c46880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000004 CR3: 000000009e4ac000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 7dc2bccab0ee ("Validate required parameters in inet6_validate_link_af") Signed-off-by: Eric Dumazet Bisected-and-reported-by: syzbot Cc: Maxim Mikityanskiy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7fd6c4da37d41d5e61e5e89f75493db1a0eaebcc Author: Vinicius Costa Gomes Date: Thu Feb 6 13:46:10 2020 -0800 taprio: Fix dropping packets when using taprio + ETF offloading [ Upstream commit bfabd41da34180d05382312533a3adc2e012dee0 ] When using taprio offloading together with ETF offloading, configured like this, for example: $ tc qdisc replace dev $IFACE parent root handle 100 taprio \ num_tc 4 \ map 2 2 1 0 3 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 1@2 1@3 \ base-time $BASE_TIME \ sched-entry S 01 1000000 \ sched-entry S 0e 1000000 \ flags 0x2 $ tc qdisc replace dev $IFACE parent 100:1 etf \ offload delta 300000 clockid CLOCK_TAI During enqueue, it works out that the verification added for the "txtime" assisted mode is run when using taprio + ETF offloading, the only thing missing is initializing the 'next_txtime' of all the cycle entries. (if we don't set 'next_txtime' all packets from SO_TXTIME sockets are dropped) Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a5b959885c45690baa4383581758030947b7b5f4 Author: Vinicius Costa Gomes Date: Thu Feb 6 13:46:09 2020 -0800 taprio: Use taprio_reset_tc() to reset Traffic Classes configuration [ Upstream commit 7c16680a08ee1e444a67d232c679ccf5b30fad16 ] When destroying the current taprio instance, which can happen when the creation of one fails, we should reset the traffic class configuration back to the default state. netdev_reset_tc() is a better way because in addition to setting the number of traffic classes to zero, it also resets the priority to traffic classes mapping to the default value. Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler") Signed-off-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ee6adcf2c3186f86747520e836586fb0946e5157 Author: Vinicius Costa Gomes Date: Thu Feb 6 13:46:08 2020 -0800 taprio: Add missing policy validation for flags [ Upstream commit 49c684d79cfdc3032344bf6f3deeea81c4efedbf ] netlink policy validation for the 'flags' argument was missing. Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d544302521af8ea959b84e453d4e881022cd228c Author: Vinicius Costa Gomes Date: Thu Feb 6 13:46:07 2020 -0800 taprio: Fix still allowing changing the flags during runtime [ Upstream commit a9d6227436f32142209f4428f2dc616761485112 ] Because 'q->flags' starts as zero, and zero is a valid value, we aren't able to detect the transition from zero to something else during "runtime". The solution is to initialize 'q->flags' with an invalid value, so we can detect if 'q->flags' was set by the user or not. To better solidify the behavior, 'flags' handling is moved to a separate function. The behavior is: - 'flags' if unspecified by the user, is assumed to be zero; - 'flags' cannot change during "runtime" (i.e. a change() request cannot modify it); With this new function we can remove taprio_flags, which should reduce the risk of future accidents. Allowing flags to be changed was causing the following RCU stall: [ 1730.558249] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 1730.558258] rcu: 6-...0: (190 ticks this GP) idle=922/0/0x1 softirq=25580/25582 fqs=16250 [ 1730.558264] (detected by 2, t=65002 jiffies, g=33017, q=81) [ 1730.558269] Sending NMI from CPU 2 to CPUs 6: [ 1730.559277] NMI backtrace for cpu 6 [ 1730.559277] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G E 5.5.0-rc6+ #35 [ 1730.559278] Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS ULTRA/Z390 AORUS ULTRA-CF, BIOS F7 03/14/2019 [ 1730.559278] RIP: 0010:__hrtimer_run_queues+0xe2/0x440 [ 1730.559278] Code: 48 8b 43 28 4c 89 ff 48 8b 75 c0 48 89 45 c8 e8 f4 bb 7c 00 0f 1f 44 00 00 65 8b 05 40 31 f0 68 89 c0 48 0f a3 05 3e 5c 25 01 <0f> 82 fc 01 00 00 48 8b 45 c8 48 89 df ff d0 89 45 c8 0f 1f 44 00 [ 1730.559279] RSP: 0018:ffff9970802d8f10 EFLAGS: 00000083 [ 1730.559279] RAX: 0000000000000006 RBX: ffff8b31645bff38 RCX: 0000000000000000 [ 1730.559280] RDX: 0000000000000000 RSI: ffffffff9710f2ec RDI: ffffffff978daf0e [ 1730.559280] RBP: ffff9970802d8f68 R08: 0000000000000000 R09: 0000000000000000 [ 1730.559280] R10: 0000018336d7944e R11: 0000000000000001 R12: ffff8b316e39f9c0 [ 1730.559281] R13: ffff8b316e39f940 R14: ffff8b316e39f998 R15: ffff8b316e39f7c0 [ 1730.559281] FS: 0000000000000000(0000) GS:ffff8b316e380000(0000) knlGS:0000000000000000 [ 1730.559281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1730.559281] CR2: 00007f1105303760 CR3: 0000000227210005 CR4: 00000000003606e0 [ 1730.559282] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1730.559282] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1730.559282] Call Trace: [ 1730.559282] [ 1730.559283] ? taprio_dequeue_soft+0x2d0/0x2d0 [sch_taprio] [ 1730.559283] hrtimer_interrupt+0x104/0x220 [ 1730.559283] ? irqtime_account_irq+0x34/0xa0 [ 1730.559283] smp_apic_timer_interrupt+0x6d/0x230 [ 1730.559284] apic_timer_interrupt+0xf/0x20 [ 1730.559284] [ 1730.559284] RIP: 0010:cpu_idle_poll+0x35/0x1a0 [ 1730.559285] Code: 88 82 ff 65 44 8b 25 12 7d 73 68 0f 1f 44 00 00 e8 90 c3 89 ff fb 65 48 8b 1c 25 c0 7e 01 00 48 8b 03 a8 08 74 0b eb 1c f3 90 <48> 8b 03 a8 08 75 13 8b 05 be a8 a8 00 85 c0 75 ed e8 75 48 84 ff [ 1730.559285] RSP: 0018:ffff997080137ea8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 [ 1730.559285] RAX: 0000000000000001 RBX: ffff8b316bc3c580 RCX: 0000000000000000 [ 1730.559286] RDX: 0000000000000001 RSI: 000000002819aad9 RDI: ffffffff978da730 [ 1730.559286] RBP: ffff997080137ec0 R08: 0000018324a6d387 R09: 0000000000000000 [ 1730.559286] R10: 0000000000000400 R11: 0000000000000001 R12: 0000000000000006 [ 1730.559286] R13: ffff8b316bc3c580 R14: 0000000000000000 R15: 0000000000000000 [ 1730.559287] ? cpu_idle_poll+0x20/0x1a0 [ 1730.559287] ? cpu_idle_poll+0x20/0x1a0 [ 1730.559287] do_idle+0x4d/0x1f0 [ 1730.559287] ? complete+0x44/0x50 [ 1730.559288] cpu_startup_entry+0x1b/0x20 [ 1730.559288] start_secondary+0x142/0x180 [ 1730.559288] secondary_startup_64+0xb6/0xc0 [ 1776.686313] nvme nvme0: I/O 96 QID 1 timeout, completion polled Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 47578c894979db4b6d7ef233df1d2747330e308c Author: Vinicius Costa Gomes Date: Thu Feb 6 13:46:06 2020 -0800 taprio: Fix enabling offload with wrong number of traffic classes [ Upstream commit 5652e63df3303c2a702bac25fbf710b9cb64dfba ] If the driver implementing taprio offloading depends on the value of the network device number of traffic classes (dev->num_tc) for whatever reason, it was going to receive the value zero. The value was only set after the offloading function is called. So, moving setting the number of traffic classes to before the offloading function is called fixes this issue. This is safe because this only happens when taprio is instantiated (we don't allow this configuration to be changed without first removing taprio). Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading") Reported-by: Po Liu Signed-off-by: Vinicius Costa Gomes Acked-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4c4153375b6737f76fc9163c77057f9c932dc3c7 Author: Harini Katakam Date: Wed Feb 5 18:08:12 2020 +0530 net: macb: Limit maximum GEM TX length in TSO [ Upstream commit f822e9c4ffa511a5c681cf866287d9383a3b6f1b ] GEM_MAX_TX_LEN currently resolves to 0x3FF8 for any IP version supporting TSO with full 14bits of length field in payload descriptor. But an IP errata causes false amba_error (bit 6 of ISR) when length in payload descriptors is specified above 16387. The error occurs because the DMA falsely concludes that there is not enough space in SRAM for incoming payload. These errors were observed continuously under stress of large packets using iperf on a version where SRAM was 16K for each queue. This errata will be documented shortly and affects all versions since TSO functionality was added. Hence limit the max length to 0x3FC0 (rounded). Signed-off-by: Harini Katakam Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 96ad794892e73db89c36cd5bdae051a51dbca16e Author: Harini Katakam Date: Wed Feb 5 18:08:11 2020 +0530 net: macb: Remove unnecessary alignment check for TSO [ Upstream commit 41c1ef978c8d0259c6636e6d2d854777e92650eb ] The IP TSO implementation does NOT require the length to be a multiple of 8. That is only a requirement for UFO as per IP documentation. Hence, exit macb_features_check function in the beginning if the protocol is not UDP. Only when it is UDP, proceed further to the alignment checks. Update comments to reflect the same. Also remove dead code checking for protocol TCP when calculating header length. Fixes: 1629dd4f763c ("cadence: Add LSO support.") Signed-off-by: Harini Katakam Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9211b26dcfee44a42ca29514435ca9323c63364e Author: Raed Salem Date: Wed Oct 23 16:41:21 2019 +0300 net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx [ Upstream commit 08db2cf577487f5123aebcc2f913e0b8a2c14b43 ] SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx, however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function nullifies sa_ctx pointer without freeing the memory allocated, hence the memory leak. Fix by free SA context when the SA is released. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem Reviewed-by: Boris Pismenny Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman commit 0be678f179e0e0b4727dccbefd1e6389bdadd16c Author: Raed Salem Date: Tue Dec 24 09:54:45 2019 +0200 net/mlx5: IPsec, Fix esp modify function attribute [ Upstream commit 0dc2c534f17c05bed0622b37a744bc38b48ca88a ] The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used with negative negation as zero value indicates success but it used as failure return value instead. Fix by remove the unary not negation operator. Fixes: 05564d0ae075 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation") Signed-off-by: Raed Salem Reviewed-by: Boris Pismenny Signed-off-by: Saeed Mahameed Signed-off-by: Greg Kroah-Hartman commit 74888191bb1392567f887264a0e994792d36982b Author: Florian Fainelli Date: Wed Feb 5 12:32:04 2020 -0800 net: systemport: Avoid RBUF stuck in Wake-on-LAN mode [ Upstream commit 263a425a482fc495d6d3f9a29b9103a664c38b69 ] After a number of suspend and resume cycles, it is possible for the RBUF to be stuck in Wake-on-LAN mode, despite the MPD enable bit being cleared which instructed the RBUF to exit that mode. Avoid creating that problematic condition by clearing the RX_EN and TX_EN bits in the UniMAC prior to disable the Magic Packet Detector logic which is guaranteed to make the RBUF exit Wake-on-LAN mode. Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8526c3b6e753789e2f6a9f8985f8fe3da153c28e Author: Dejin Zheng Date: Thu Feb 6 23:29:17 2020 +0800 net: stmmac: fix a possible endless loop [ Upstream commit 7d10f0774f9e32aa2f2e012f7fcb312a2ce422b9 ] It forgot to reduce the value of the variable retry in a while loop in the ethqos_configure() function. It may cause an endless loop and without timeout. Fixes: a7c30e62d4b8 ("net: stmmac: Add driver for Qualcomm ethqos") Signed-off-by: Dejin Zheng Acked-by: Vinod Koul Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ebf9cdfbcd38593575f72bfb31dc76e53bd3c6d6 Author: Cong Wang Date: Tue Feb 4 11:10:12 2020 -0800 net_sched: fix a resource leak in tcindex_set_parms() [ Upstream commit 52b5ae501c045010aeeb1d5ac0373ff161a88291 ] Jakub noticed there is a potential resource leak in tcindex_set_parms(): when tcindex_filter_result_init() fails and it jumps to 'errout1' which doesn't release the memory and resources allocated by tcindex_alloc_perfect_hash(). We should just jump to 'errout_alloc' which calls tcindex_free_perfect_hash(). Fixes: b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()") Reported-by: Jakub Kicinski Cc: Jamal Hadi Salim Cc: Jiri Pirko Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6994d92ed59ae655c95a39362eabf396ffbaa5be Author: Lorenzo Bianconi Date: Thu Feb 6 10:14:39 2020 +0100 net: mvneta: move rx_dropped and rx_errors in per-cpu stats [ Upstream commit c35947b8ff8acca33134ee39c31708233765c31a ] Move rx_dropped and rx_errors counters in mvneta_pcpu_stats in order to avoid possible races updating statistics Fixes: 562e2f467e71 ("net: mvneta: Improve the buffer allocation method for SWBM") Fixes: dc35a10f68d3 ("net: mvneta: bm: add support for hardware buffer management") Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: Lorenzo Bianconi Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 76e828ceafee93bce89f72399960f8f06501e4f8 Author: Razvan Stefanescu Date: Fri Feb 7 17:44:04 2020 +0200 net: dsa: microchip: enable module autoprobe [ Upstream commit f8c2afa66d5397b0b9293c4347dac6dabb327685 ] This matches /sys/devices/.../spi1.0/modalias content. Fixes: 9b2d9f05cddf ("net: dsa: microchip: add ksz9567 to ksz9477 driver") Fixes: d9033ae95cf4 ("net: dsa: microchip: add KSZ8563 compatibility string") Fixes: 8c29bebb1f8a ("net: dsa: microchip: add KSZ9893 switch support") Fixes: 45316818371d ("net: dsa: add support for ksz9897 ethernet switch") Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Razvan Stefanescu Signed-off-by: Codrin Ciubotariu Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 07d7fcb1dd2960c6393a9cbce8323572fc45a5a2 Author: Florian Fainelli Date: Thu Feb 6 11:23:52 2020 -0800 net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port [ Upstream commit de34d7084edd069dac5aa010cfe32bd8c4619fa6 ] The 7445 switch clocking profiles do not allow us to run the IMP port at 2Gb/sec in a way that it is reliable and consistent. Make sure that the setting is only applied to the 7278 family. Fixes: 8f1880cbe8d0 ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 84e4db766fc6bb4d48f272ec2f5d3c30240c2e09 Author: Florian Fainelli Date: Thu Feb 6 11:07:45 2020 -0800 net: dsa: b53: Always use dev->vlan_enabled in b53_configure_vlan() [ Upstream commit df373702bc0f8f2d83980ea441e71639fc1efcf8 ] b53_configure_vlan() is called by the bcm_sf2 driver upon setup and indirectly through resume as well. During the initial setup, we are guaranteed that dev->vlan_enabled is false, so there is no change in behavior, however during suspend, we may have enabled VLANs before, so we do want to restore that setting. Fixes: dad8d7c6452b ("net: dsa: b53: Properly account for VLAN filtering") Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fddb5a50801af7808be42ae73deb81e2675439be Author: Madalin Bucur Date: Tue Feb 4 12:08:58 2020 +0200 dpaa_eth: support all modes with rate adapting PHYs [ Upstream commit 73a21fa817f0cc8022dc6226250a86bca727a56d ] Stop removing modes that are not supported on the system interface when the connected PHY is capable of rate adaptation. This addresses an issue with the LS1046ARDB board 10G interface no longer working with an 1G link partner after autonegotiation support was added for the Aquantia PHY on board in commit 09c4c57f7bc4 ("net: phy: aquantia: add support for auto-negotiation configuration") Before this commit the values advertised by the PHY were not influenced by the dpaa_eth driver removal of system-side unsupported modes as the aqr_config_aneg() was basically a no-op. After this commit, the modes removed by the dpaa_eth driver were no longer advertised thus autonegotiation with 1G link partners failed. Reported-by: Mian Yousaf Kaukab Signed-off-by: Madalin Bucur Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3f90dce11916a8dd398ee911fa5af5ea129ed853 Author: Jacob Keller Date: Tue Feb 4 15:59:50 2020 -0800 devlink: report 0 after hitting end in region read [ Upstream commit d5b90e99e1d51b7b5d2b74fbc4c2db236a510913 ] commit fdd41ec21e15 ("devlink: Return right error code in case of errors for region read") modified the region read code to report errors properly in unexpected cases. In the case where the start_offset and ret_offset match, it unilaterally converted this into an error. This causes an issue for the "dump" version of the command. In this case, the devlink region dump will always report an invalid argument: 000000000000ffd0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 000000000000ffe0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff devlink answers: Invalid argument 000000000000fff0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff This occurs because the expected flow for the dump is to return 0 after there is no further data. The simplest fix would be to stop converting the error code to -EINVAL if start_offset == ret_offset. However, avoid unnecessary work by checking for when start_offset is larger than the region size and returning 0 upfront. Fixes: fdd41ec21e15 ("devlink: Return right error code in case of errors for region read") Signed-off-by: Jacob Keller Acked-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6978e2935c74b0e9b51454fea5e8069ef392b3d1 Author: Eric Dumazet Date: Tue Feb 4 19:26:05 2020 -0800 bonding/alb: properly access headers in bond_alb_xmit() [ Upstream commit 38f88c45404293bbc027b956def6c10cbd45c616 ] syzbot managed to send an IPX packet through bond_alb_xmit() and af_packet and triggered a use-after-free. First, bond_alb_xmit() was using ipx_hdr() helper to reach the IPX header, but ipx_hdr() was using the transport offset instead of the network offset. In the particular syzbot report transport offset was 0xFFFF This patch removes ipx_hdr() since it was only (mis)used from bonding. Then we need to make sure IPv4/IPv6/IPX headers are pulled in skb->head before dereferencing anything. BUG: KASAN: use-after-free in bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452 Read of size 2 at addr ffff8801ce56dfff by task syz-executor.2/18108 (if (ipx_hdr(skb)->ipx_checksum != IPX_NO_CHECKSUM) ...) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: [] __dump_stack lib/dump_stack.c:17 [inline] [] dump_stack+0x14d/0x20b lib/dump_stack.c:53 [] print_address_description+0x6f/0x20b mm/kasan/report.c:282 [] kasan_report_error mm/kasan/report.c:380 [inline] [] kasan_report mm/kasan/report.c:438 [inline] [] kasan_report.cold+0x8c/0x2a0 mm/kasan/report.c:422 [] __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:469 [] bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452 [] __bond_start_xmit drivers/net/bonding/bond_main.c:4199 [inline] [] bond_start_xmit+0x4f4/0x1570 drivers/net/bonding/bond_main.c:4224 [] __netdev_start_xmit include/linux/netdevice.h:4525 [inline] [] netdev_start_xmit include/linux/netdevice.h:4539 [inline] [] xmit_one net/core/dev.c:3611 [inline] [] dev_hard_start_xmit+0x168/0x910 net/core/dev.c:3627 [] __dev_queue_xmit+0x1f55/0x33b0 net/core/dev.c:4238 [] dev_queue_xmit+0x18/0x20 net/core/dev.c:4278 [] packet_snd net/packet/af_packet.c:3226 [inline] [] packet_sendmsg+0x4919/0x70b0 net/packet/af_packet.c:3252 [] sock_sendmsg_nosec net/socket.c:673 [inline] [] sock_sendmsg+0x12c/0x160 net/socket.c:684 [] __sys_sendto+0x262/0x380 net/socket.c:1996 [] SYSC_sendto net/socket.c:2008 [inline] [] SyS_sendto+0x40/0x60 net/socket.c:2004 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Jay Vosburgh Cc: Veaceslav Falico Cc: Andy Gospodarek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0a56a2e1624a289aff995f357bef48e5504b47ee Author: Marek Vasut Date: Fri Dec 20 17:44:50 2019 +0100 ASoC: sgtl5000: Fix VDDA and VDDIO comparison commit e19ecbf105b236a6334fab64d8fd5437b12ee019 upstream. Comparing the voltage of VDDA and VDDIO to determine whether or not to enable VDDC manual override is insufficient. This is a problem in case the VDDA is supplied from different regulator than VDDIO, while both report the same voltage to the regulator framework. In that case where VDDA and VDDIO is supplied by different regulators, the VDDC manual override must not be applied. Fixes: b6319b061ba2 ("ASoC: sgtl5000: Fix charge pump source assignment") Signed-off-by: Marek Vasut Cc: Fabio Estevam Cc: Igor Opaniuk Cc: Marcel Ziswiler Cc: Mark Brown Cc: Oleksandr Suvorov Link: https://lore.kernel.org/r/20191220164450.1395038-2-marex@denx.de Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 67d5b7a1f971b35e63e5c46bed03016f766e5abc Author: Marek Vasut Date: Fri Dec 20 17:44:49 2019 +0100 regulator: core: Add regulator_is_equal() helper commit b059b7e0ec3208ff1e17cff6387d75a9fbab4e02 upstream. Add regulator_is_equal() helper to compare whether two regulators are the same. This is useful for checking whether two separate regulators in a driver are actually the same supply. Signed-off-by: Marek Vasut Cc: Fabio Estevam Cc: Igor Opaniuk Cc: Liam Girdwood Cc: Marcel Ziswiler Cc: Mark Brown Cc: Oleksandr Suvorov Link: https://lore.kernel.org/r/20191220164450.1395038-1-marex@denx.de Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit c533cf50fdf6158bcad97568926c4089a76d68be Author: Quanyang Wang Date: Tue Jan 14 13:43:11 2020 +0800 ubifs: Fix memory leak from c->sup_node commit ff90bdfb206e49c8b418811efbdd0c77380fa8c2 upstream. The c->sup_node is allocated in function ubifs_read_sb_node but is not freed. This will cause memory leak as below: unreferenced object 0xbc9ce000 (size 4096): comm "mount", pid 500, jiffies 4294952946 (age 315.820s) hex dump (first 32 bytes): 31 18 10 06 06 7b f1 11 02 00 00 00 00 00 00 00 1....{.......... 00 10 00 00 06 00 00 00 00 00 00 00 08 00 00 00 ................ backtrace: [] ubifs_read_superblock+0x48/0xebc [] ubifs_mount+0x974/0x1420 [<8589ecc3>] legacy_get_tree+0x2c/0x50 [<5f1fb889>] vfs_get_tree+0x28/0xfc [] do_mount+0x4f8/0x748 [<4151f538>] ksys_mount+0x78/0xa0 [] ret_fast_syscall+0x0/0x54 [<1cc40005>] 0x7ea02790 Free it in ubifs_umount and in the error path of mount_ubifs. Fixes: fd6150051bec ("ubifs: Store read superblock node") Signed-off-by: Quanyang Wang Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit fa70d4f7f8e0149d9a6800c80ef703b3ed0d3d53 Author: Dan Carpenter Date: Mon Jan 13 16:23:46 2020 +0300 ubi: Fix an error pointer dereference in error handling code commit 5d3805af279c93ef49a64701f35254676d709622 upstream. If "seen_pebs = init_seen(ubi);" fails then "seen_pebs" is an error pointer and we try to kfree() it which results in an Oops. This patch re-arranges the error handling so now it only frees things which have been allocated successfully. Fixes: daef3dd1f0ae ("UBI: Fastmap: Add self check to detect absent PEBs") Signed-off-by: Dan Carpenter Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 6f362620793be78637d18c862d960248cad17eb2 Author: Sascha Hauer Date: Wed Oct 23 11:58:12 2019 +0200 ubi: fastmap: Fix inverted logic in seen selfcheck commit ef5aafb6e4e9942a28cd300bdcda21ce6cbaf045 upstream. set_seen() sets the bit corresponding to the PEB number in the bitmap, so when self_check_seen() wants to find PEBs that haven't been seen we have to print the PEBs that have their bit cleared, not the ones which have it set. Fixes: 5d71afb00840 ("ubi: Use bitmaps in Fastmap self-check code") Signed-off-by: Sascha Hauer Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit c6d07f6e50e47d439fd25fa0ae5382588636c38b Author: David Hildenbrand Date: Wed Feb 5 17:34:01 2020 +0100 virtio_balloon: Fix memory leaks on errors in virtballoon_probe() commit 1ad6f58ea9364b0a5d8ae06249653ac9304a8578 upstream. We forget to put the inode and unmount the kernfs used for compaction. Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker") Cc: "Michael S. Tsirkin" Cc: Jason Wang Cc: Wei Wang Cc: Liang Li Signed-off-by: David Hildenbrand Link: https://lore.kernel.org/r/20200205163402.42627-3-david@redhat.com Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit 3ac13462f55add67754cee52e3e9bffde040894d Author: David Hildenbrand Date: Wed Feb 5 17:34:00 2020 +0100 virtio-balloon: Fix memory leak when unloading while hinting is in progress commit 6c22dc61c76b7e7d355f1697ba0ecf26d1334ba6 upstream. When unloading the driver while hinting is in progress, we will not release the free page blocks back to MM, resulting in a memory leak. Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Cc: "Michael S. Tsirkin" Cc: Jason Wang Cc: Wei Wang Cc: Liang Li Signed-off-by: David Hildenbrand Link: https://lore.kernel.org/r/20200205163402.42627-2-david@redhat.com Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit 7eece787ed821b073326f273d5abdb6be2649c94 Author: Trond Myklebust Date: Tue Dec 17 12:33:33 2019 -0500 nfsd: Return the correct number of bytes written to the file commit 09a80f2aef06b7c86143f5c14efd3485e0d2c139 upstream. We must allow for the fact that iov_iter_write() could have returned a short write (e.g. if there was an ENOSPC issue). Fixes: d890be159a71 "nfsd: Add I/O trace points in the NFSv4 write path" Signed-off-by: Trond Myklebust Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit e94829641e6995d04cdd13765bfa0723abf9d906 Author: Arnd Bergmann Date: Mon Nov 4 14:43:17 2019 +0100 nfsd: fix jiffies/time_t mixup in LRU list commit 9594497f2c78993cb66b696122f7c65528ace985 upstream. The nfsd4_blocked_lock->nbl_time timestamp is recorded in jiffies, but then compared to a CLOCK_REALTIME timestamp later on, which makes no sense. For consistency with the other timestamps, change this to use a time_t. This is a change in behavior, which may cause regressions, but the current code is not sensible. On a system with CONFIG_HZ=1000, the 'time_after((unsigned long)nbl->nbl_time, (unsigned long)cutoff))' check is false for roughly the first 18 days of uptime and then true for the next 49 days. Fixes: 7919d0a27f1e ("nfsd: add a LRU list for blocked locks") Signed-off-by: Arnd Bergmann Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 9f3fa8bea96d07a1a83176947493e20346422633 Author: Arnd Bergmann Date: Sun Nov 3 22:32:20 2019 +0100 nfsd: fix delay timer on 32-bit architectures commit 2561c92b12f4f4e386d453556685f75775c0938b upstream. The nfsd4_cb_layout_done() function takes a 'time_t' value, multiplied by NSEC_PER_SEC*2 to get a nanosecond value. This works fine on 64-bit architectures, but on 32-bit, any value over 1 second results in a signed integer overflow with unexpected results. Cast one input to a 64-bit type in order to produce the same result that we have on 64-bit architectures, regarless of the type of nfsd4_lease. Fixes: 6b9b21073d3b ("nfsd: give up on CB_LAYOUTRECALLs after two lease periods") Signed-off-by: Arnd Bergmann Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 0d1dacfda0e54da37bb7dacafff4002605c91e8d Author: Yishai Hadas Date: Sun Dec 22 14:46:48 2019 +0200 IB/core: Fix ODP get user pages flow commit d07de8bd1709a80a282963ad7b2535148678a9e4 upstream. The nr_pages argument of get_user_pages_remote() should always be in terms of the system page size, not the MR page size. Use PAGE_SIZE instead of umem_odp->page_shift. Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support") Link: https://lore.kernel.org/r/20191222124649.52300-3-leon@kernel.org Signed-off-by: Yishai Hadas Reviewed-by: Artemy Kovalyov Reviewed-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit 320a24fae28097cb54140182212e713e1922e069 Author: Prabhath Sajeepa Date: Thu Dec 12 17:11:29 2019 -0700 IB/mlx5: Fix outstanding_pi index for GSI qps commit b5671afe5e39ed71e94eae788bacdcceec69db09 upstream. Commit b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") changed the way outstanding WRs are tracked for the GSI QP. But the fix did not cover the case when a call to ib_post_send() fails and updates index to track outstanding. Since the prior commmit outstanding_pi should not be bounded otherwise the loop generate_completions() will fail. Fixes: b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") Link: https://lore.kernel.org/r/1576195889-23527-1-git-send-email-psajeepa@purestorage.com Signed-off-by: Prabhath Sajeepa Acked-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe Signed-off-by: Greg Kroah-Hartman commit b4d104ce6dfeaf1a9914561cdc43bc4eab7ea7c8 Author: Nathan Chancellor Date: Mon Dec 9 14:16:23 2019 -0700 net: tulip: Adjust indentation in {dmfe, uli526x}_init_module commit fe06bf3d83ef0d92f35a24e03297172e92ce9ce3 upstream. Clang warns: ../drivers/net/ethernet/dec/tulip/uli526x.c:1812:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] switch (mode) { ^ ../drivers/net/ethernet/dec/tulip/uli526x.c:1809:2: note: previous statement is here if (cr6set) ^ 1 warning generated. ../drivers/net/ethernet/dec/tulip/dmfe.c:2217:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] switch(mode) { ^ ../drivers/net/ethernet/dec/tulip/dmfe.c:2214:2: note: previous statement is here if (cr6set) ^ 1 warning generated. This warning occurs because there is a space before the tab on these lines. Remove them so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. While we are here, adjust the default block in dmfe_init_module to have a proper break between the label and assignment and add a space between the switch and opening parentheses to avoid a checkpatch warning. Fixes: e1c3e5014040 ("[PATCH] initialisation cleanup for ULI526x-net-driver") Link: https://github.com/ClangBuiltLinux/linux/issues/795 Signed-off-by: Nathan Chancellor Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9b14bd934ce30882da5a45b99daff011363f8aca Author: Nathan Chancellor Date: Mon Dec 9 14:50:27 2019 -0700 net: smc911x: Adjust indentation in smc911x_phy_configure commit 5c61e223004b3b5c3f1dd25718e979bc17a3b12d upstream. Clang warns: ../drivers/net/ethernet/smsc/smc911x.c:939:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] if (!lp->ctl_rfduplx) ^ ../drivers/net/ethernet/smsc/smc911x.c:936:2: note: previous statement is here if (lp->ctl_rspeed != 100) ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 0a0c72c9118c ("[PATCH] RE: [PATCH 1/1] net driver: Add support for SMSC LAN911x line of ethernet chips") Link: https://github.com/ClangBuiltLinux/linux/issues/796 Signed-off-by: Nathan Chancellor Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 761b514092467bdae6a665ead7cf2029241a85d7 Author: Nathan Chancellor Date: Mon Dec 9 15:38:59 2019 -0700 ppp: Adjust indentation into ppp_async_input commit 08cbc75f96029d3092664213a844a5e25523aa35 upstream. Clang warns: ../drivers/net/ppp/ppp_async.c:877:6: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] ap->rpkt = skb; ^ ../drivers/net/ppp/ppp_async.c:875:5: note: previous statement is here if (!skb) ^ 1 warning generated. This warning occurs because there is a space before the tab on this line. Clean up this entire block's indentation so that it is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 6722e78c9005 ("[PPP]: handle misaligned accesses") Link: https://github.com/ClangBuiltLinux/linux/issues/800 Signed-off-by: Nathan Chancellor Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bd1bac782d92ce2f869aec6fe49c283868eb47df Author: Nathan Chancellor Date: Tue Dec 17 18:21:52 2019 -0700 NFC: pn544: Adjust indentation in pn544_hci_check_presence commit 5080832627b65e3772a35d1dced68c64e2b24442 upstream. Clang warns ../drivers/nfc/pn544/pn544.c:696:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] return nfc_hci_send_cmd(hdev, NFC_HCI_RF_READER_A_GATE, ^ ../drivers/nfc/pn544/pn544.c:692:3: note: previous statement is here if (target->nfcid1_len != 4 && target->nfcid1_len != 7 && ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: da052850b911 ("NFC: Add pn544 presence check for different targets") Link: https://github.com/ClangBuiltLinux/linux/issues/814 Signed-off-by: Nathan Chancellor Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit da535ca18ef5974eadd0329aa0a2a7df1de92dc3 Author: Nathan Chancellor Date: Mon Dec 9 13:32:30 2019 -0700 drm: msm: mdp4: Adjust indentation in mdp4_dsi_encoder_enable commit 251e3cb1418ff3f5061ee31335e346e852b16573 upstream. Clang warns: ../drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c:124:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] mdp4_crtc_set_config(encoder->crtc, ^ ../drivers/gpu/drm/msm/disp/mdp4/mdp4_dsi_encoder.c:121:2: note: previous statement is here if (mdp4_dsi_encoder->enabled) ^ This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 776638e73a19 ("drm/msm/dsi: Add a mdp4 encoder for DSI") Link: https://github.com/ClangBuiltLinux/linux/issues/792 Signed-off-by: Nathan Chancellor Reviewed-by: Nick Desaulniers Signed-off-by: Rob Clark Signed-off-by: Greg Kroah-Hartman commit bf45386cef7e8b5309a0191f273a7fa64a067273 Author: Nathan Chancellor Date: Mon Dec 9 13:03:38 2019 -0700 powerpc/44x: Adjust indentation in ibm4xx_denali_fixup_memsize commit c3aae14e5d468d18dbb5d7c0c8c7e2968cc14aad upstream. Clang warns: ../arch/powerpc/boot/4xx.c:231:3: warning: misleading indentation; statement is not part of the previous 'else' [-Wmisleading-indentation] val = SDRAM0_READ(DDR0_42); ^ ../arch/powerpc/boot/4xx.c:227:2: note: previous statement is here else ^ This is because there is a space at the beginning of this line; remove it so that the indentation is consistent according to the Linux kernel coding style and clang no longer warns. Fixes: d23f5099297c ("[POWERPC] 4xx: Adds decoding of 440SPE memory size to boot wrapper library") Signed-off-by: Nathan Chancellor Reviewed-by: Nick Desaulniers Signed-off-by: Michael Ellerman Link: https://github.com/ClangBuiltLinux/linux/issues/780 Link: https://lore.kernel.org/r/20191209200338.12546-1-natechancellor@gmail.com Signed-off-by: Greg Kroah-Hartman commit 0a7473b27eb9d6cb189acede6f6d61a1cf7c0072 Author: Nathan Chancellor Date: Tue Dec 17 20:19:31 2019 -0700 ext2: Adjust indentation in ext2_fill_super commit d9e9866803f7b6c3fdd35d345e97fb0b2908bbbc upstream. Clang warns: ../fs/ext2/super.c:1076:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] sbi->s_groups_count = ((le32_to_cpu(es->s_blocks_count) - ^ ../fs/ext2/super.c:1074:2: note: previous statement is here if (EXT2_BLOCKS_PER_GROUP(sb) == 0) ^ 1 warning generated. This warning occurs because there is a space before the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 41f04d852e35 ("[PATCH] ext2: fix mounts at 16T") Link: https://github.com/ClangBuiltLinux/linux/issues/827 Link: https://lore.kernel.org/r/20191218031930.31393-1-natechancellor@gmail.com Signed-off-by: Nathan Chancellor Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 421b77ae26f47600ef3b97e5d05736c15a5939ba Author: Nathan Chancellor Date: Tue Dec 17 18:36:37 2019 -0700 phy: qualcomm: Adjust indentation in read_poll_timeout commit a89806c998ee123bb9c0f18526e55afd12c0c0ab upstream. Clang warns: ../drivers/phy/qualcomm/phy-qcom-apq8064-sata.c:83:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] usleep_range(DELAY_INTERVAL_US, DELAY_INTERVAL_US + 50); ^ ../drivers/phy/qualcomm/phy-qcom-apq8064-sata.c:80:3: note: previous statement is here if (readl_relaxed(addr) & mask) ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 1de990d8a169 ("phy: qcom: Add driver for QCOM APQ8064 SATA PHY") Link: https://github.com/ClangBuiltLinux/linux/issues/816 Signed-off-by: Nathan Chancellor Reviewed-by: Bjorn Andersson Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Greg Kroah-Hartman commit 55a268cf341f2a19ab3c91871c5dd0f2129c77f2 Author: Vignesh Raghavendra Date: Thu Dec 5 12:29:33 2019 +0530 mtd: spi-nor: Split mt25qu512a (n25q512a) entry into two commit bd8a6e31b87b39a03ab11820776363640440dbe0 upstream. mt25q family is different from n25q family of devices, even though manf ID and device IDs are same. mt25q flash has bit 6 set in 5th byte of READ ID response which can be used to distinguish it from n25q variant. mt25q flashes support stateless 4 Byte addressing opcodes where as n25q flashes don't. Therefore, have two separate entries for mt25qu512a and n25q512a. Fixes: 9607af6f857f ("mtd: spi-nor: Rename "n25q512a" to "mt25qu512a (n25q512a)"") Signed-off-by: Vignesh Raghavendra Signed-off-by: Tudor Ambarus Signed-off-by: Greg Kroah-Hartman commit 3c9edf55817abad1ac54865a246b0aeb0bf66ad3 Author: Asutosh Das Date: Mon Nov 25 22:53:30 2019 -0800 scsi: ufs: Recheck bkops level if bkops is disabled commit 24366c2afbb0539fb14eff330d4e3a5db5c0a3ef upstream. bkops level should be rechecked upon receiving an exception. Currently the level is being cached and never updated. Update bkops each time the level is checked. Also do not use the cached bkops level value if it is disabled and then enabled. Fixes: afdfff59a0e0 (scsi: ufs: handle non spec compliant bkops behaviour by device) Link: https://lore.kernel.org/r/1574751214-8321-2-git-send-email-cang@qti.qualcomm.com Reviewed-by: Bean Huo Reviewed-by: Alim Akhtar Tested-by: Alim Akhtar Signed-off-by: Asutosh Das Signed-off-by: Can Guo Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 5f8c02d32223206cad19b0854988a3a2f485b56f Author: Nathan Chancellor Date: Tue Dec 17 18:52:52 2019 -0700 scsi: qla4xxx: Adjust indentation in qla4xxx_mem_free commit aa8679736a82386551eb9f3ea0e6ebe2c0e99104 upstream. Clang warns: ../drivers/scsi/qla4xxx/ql4_os.c:4148:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] if (ha->fw_dump) ^ ../drivers/scsi/qla4xxx/ql4_os.c:4144:2: note: previous statement is here if (ha->queues) ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: 068237c87c64 ("[SCSI] qla4xxx: Capture minidump for ISP82XX on firmware failure") Link: https://github.com/ClangBuiltLinux/linux/issues/819 Link: https://lore.kernel.org/r/20191218015252.20890-1-natechancellor@gmail.com Acked-by: Manish Rangankar Reviewed-by: Nick Desaulniers Signed-off-by: Nathan Chancellor Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit bdc7651e61644489e0b2928c40e0efd508c45766 Author: Nathan Chancellor Date: Tue Dec 17 18:47:26 2019 -0700 scsi: csiostor: Adjust indentation in csio_device_reset commit a808a04c861782e31fc30e342a619c144aaee14a upstream. Clang warns: ../drivers/scsi/csiostor/csio_scsi.c:1386:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] csio_lnodes_exit(hw, 1); ^ ../drivers/scsi/csiostor/csio_scsi.c:1382:2: note: previous statement is here if (*buf != '1') ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: a3667aaed569 ("[SCSI] csiostor: Chelsio FCoE offload driver") Link: https://github.com/ClangBuiltLinux/linux/issues/818 Link: https://lore.kernel.org/r/20191218014726.8455-1-natechancellor@gmail.com Signed-off-by: Nathan Chancellor Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 454db8d6163b583e6404a0a212a438c691847bd7 Author: Bart Van Assche Date: Wed Dec 18 16:49:05 2019 -0800 scsi: qla2xxx: Fix the endianness of the qla82xx_get_fw_size() return type commit 3f5f7335e5e234e340b48ecb24c2aba98a61f934 upstream. Since qla82xx_get_fw_size() returns a number in CPU-endian format, change its return type from __le32 into u32. This patch does not change any functionality. Fixes: 9c2b297572bf ("[SCSI] qla2xxx: Support for loading Unified ROM Image (URI) format firmware file.") Cc: Himanshu Madhani Cc: Quinn Tran Cc: Martin Wilck Cc: Daniel Wagner Cc: Roman Bolshakov Link: https://lore.kernel.org/r/20191219004905.39586-1-bvanassche@acm.org Reviewed-by: Daniel Wagner Reviewed-by: Roman Bolshakov Signed-off-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 05dceb5a62ce5b8b6fe466d92101a0bef13e275e Author: Jerome Brunet Date: Wed Dec 18 18:24:17 2019 +0100 ASoC: meson: axg-fifo: fix fifo threshold setup commit 864cee90d4bd870e5d5e5a0b1a6f055f4f951350 upstream. On TODDR sm1, the fifo threshold register field is slightly different compared to the other SoCs. This leads to the fifo A being flushed to memory every 8kB. If the period is smaller than that, several periods are pushed to memory and notified at once. This is not ideal. Fix the register field update. With this, the fifos are flushed every 128B. We could still do better, like adapt the threshold depending on the period size, but at least it consistent across the different SoC/fifos Fixes: 5ac825c3d85e ("ASoC: meson: axg-toddr: add sm1 support") Reported-by: Alden DSouza Signed-off-by: Jerome Brunet Link: https://lore.kernel.org/r/20191218172420.1199117-2-jbrunet@baylibre.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 7c662f69fa0530c45afa100ab4aaf0d9c8d92199 Author: Erdem Aktas Date: Fri Dec 13 13:31:46 2019 -0800 percpu: Separate decrypted varaibles anytime encryption can be enabled commit 264b0d2bee148073c117e7bbbde5be7125a53be1 upstream. CONFIG_VIRTUALIZATION may not be enabled for memory encrypted guests. If disabled, decrypted per-CPU variables may end up sharing the same page with variables that should be left encrypted. Always separate per-CPU variables that should be decrypted into their own page anytime memory encryption can be enabled in the guest rather than rely on any other config option that may not be enabled. Fixes: ac26963a1175 ("percpu: Introduce DEFINE_PER_CPU_DECRYPTED") Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Erdem Aktas Signed-off-by: David Rientjes Signed-off-by: Dennis Zhou Signed-off-by: Greg Kroah-Hartman commit 59c458d510998ff95d17c7cf778cc7af05a63684 Author: Casey Schaufler Date: Mon Feb 3 09:15:00 2020 -0800 broken ping to ipv6 linklocal addresses on debian buster commit 87fbfffcc89b92a4281b0aa53bd06af714087889 upstream. I am seeing ping failures to IPv6 linklocal addresses with Debian buster. Easiest example to reproduce is: $ ping -c1 -w1 ff02::1%eth1 connect: Invalid argument $ ping -c1 -w1 ff02::1%eth1 PING ff02::01%eth1(ff02::1%eth1) 56 data bytes 64 bytes from fe80::e0:f9ff:fe0c:37%eth1: icmp_seq=1 ttl=64 time=0.059 ms git bisect traced the failure to commit b9ef5513c99b ("smack: Check address length before reading address family") Arguably ping is being stupid since the buster version is not setting the address family properly (ping on stretch for example does): $ strace -e connect ping6 -c1 -w1 ff02::1%eth1 connect(5, {sa_family=AF_UNSPEC, sa_data="\4\1\0\0\0\0\377\2\0\0\0\0\0\0\0\0\0\0\0\0\0\1\3\0\0\0"}, 28) = -1 EINVAL (Invalid argument) but the command works fine on kernels prior to this commit, so this is breakage which goes against the Linux paradigm of "don't break userspace" Cc: stable@vger.kernel.org Reported-by: David Ahern Suggested-by: Tetsuo Handa Signed-off-by: Casey Schaufler Signed-off-by: Greg Kroah-Hartman  security/smack/smack_lsm.c | 41 +++++++++++++++++++---------------------- security/smack/smack_lsm.c | 41 +++++++++++++++++++---------------------- 1 file changed, 19 insertions(+), 22 deletions(-) commit 07fbef9a6e1885b2b321d337706ebf4c87501307 Author: Miklos Szeredi Date: Thu Feb 6 16:39:28 2020 +0100 fix up iter on short count in fuse_direct_io() commit f658adeea45e430a24c7a157c3d5448925ac2038 upstream. fuse_direct_io() can end up advancing the iterator by more than the amount of data read or written. This case is handled by the generic code if going through ->direct_IO(), but not in the FOPEN_DIRECT_IO case. Fix by reverting the extra bytes from the iterator in case of error or a short count. To test: install lxcfs, then the following testcase int fd = open("/var/lib/lxcfs/proc/uptime", O_RDONLY); sendfile(1, fd, NULL, 16777216); sendfile(1, fd, NULL, 16777216); will spew WARN_ON() in iov_iter_pipe(). Reported-by: Peter Geis Reported-by: Al Viro Fixes: 3c3db095b68c ("fuse: use iov_iter based generic splice helpers") Cc: # v5.1 Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit e0fc65ef8a60df6c784843354c29238fe8c3db5f Author: Daniel Verkamp Date: Fri Jan 3 10:40:45 2020 -0800 virtio-pci: check name when counting MSI-X vectors commit 303090b513fd1ee45aa1536b71a3838dc054bc05 upstream. VQs without a name specified are not valid; they are skipped in the later loop that assigns MSI-X vectors to queues, but the per_vq_vectors loop above that counts the required number of vectors previously still counted any queue with a non-NULL callback as needing a vector. Add a check to the per_vq_vectors loop so that vectors with no name are not counted to make the two loops consistent. This prevents over-counting unnecessary vectors (e.g. for features which were not negotiated with the device). Cc: stable@vger.kernel.org Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Reviewed-by: Cornelia Huck Signed-off-by: Daniel Verkamp Signed-off-by: Michael S. Tsirkin Reviewed-by: Wang, Wei W Signed-off-by: Greg Kroah-Hartman commit f603b3714e4e37135e0fd10297e02d2cc96240b9 Author: Daniel Verkamp Date: Fri Jan 3 10:40:43 2020 -0800 virtio-balloon: initialize all vq callbacks commit 5790b53390e18fdd21e70776e46d058c05eda2f2 upstream. Ensure that elements of the callbacks array that correspond to unavailable features are set to NULL; previously, they would be left uninitialized. Since the corresponding names array elements were explicitly set to NULL, the uninitialized callback pointers would not actually be dereferenced; however, the uninitialized callbacks elements would still be read in vp_find_vqs_msix() and used to calculate the number of MSI-X vectors required. Cc: stable@vger.kernel.org Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Reviewed-by: Cornelia Huck Signed-off-by: Daniel Verkamp Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit fe84d084b2e9f9c6c641246ca15c402e35c11077 Author: Lyude Paul Date: Fri Jan 24 14:10:46 2020 -0500 drm/amd/dm/mst: Ignore payload update failures commit 58fe03d6dec908a1bec07eea7e94907af5c07eec upstream. Disabling a display on MST can potentially happen after the entire MST topology has been removed, which means that we can't communicate with the topology at all in this scenario. Likewise, this also means that we can't properly update payloads on the topology and as such, it's a good idea to ignore payload update failures when disabling displays. Currently, amdgpu makes the mistake of halting the payload update process when any payload update failures occur, resulting in leaving DC's local copies of the payload tables out of date. This ends up causing problems with hotplugging MST topologies, and causes modesets on the second hotplug to fail like so: [drm] Failed to updateMST allocation table forpipe idx:1 ------------[ cut here ]------------ WARNING: CPU: 5 PID: 1511 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:2677 update_mst_stream_alloc_table+0x11e/0x130 [amdgpu] Modules linked in: cdc_ether usbnet fuse xt_conntrack nf_conntrack nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 nft_counter nft_compat nf_tables nfnetlink tun bridge stp llc sunrpc vfat fat wmi_bmof uvcvideo snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi videobuf2_vmalloc snd_hda_intel videobuf2_memops videobuf2_v4l2 snd_intel_dspcfg videobuf2_common crct10dif_pclmul snd_hda_codec videodev crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel snd_seq mc joydev pcspkr snd_seq_device snd_pcm sp5100_tco k10temp i2c_piix4 snd_timer thinkpad_acpi ledtrig_audio snd wmi soundcore video i2c_scmi acpi_cpufreq ip_tables amdgpu(O) rtsx_pci_sdmmc amd_iommu_v2 gpu_sched mmc_core i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm crc32c_intel serio_raw hid_multitouch r8152 mii nvme r8169 nvme_core rtsx_pci pinctrl_amd CPU: 5 PID: 1511 Comm: gnome-shell Tainted: G O 5.5.0-rc7Lyude-Test+ #4 Hardware name: LENOVO FA495SIT26/FA495SIT26, BIOS R12ET22W(0.22 ) 01/31/2019 RIP: 0010:update_mst_stream_alloc_table+0x11e/0x130 [amdgpu] Code: 28 00 00 00 75 2b 48 8d 65 e0 5b 41 5c 41 5d 41 5e 5d c3 0f b6 06 49 89 1c 24 41 88 44 24 08 0f b6 46 01 41 88 44 24 09 eb 93 <0f> 0b e9 2f ff ff ff e8 a6 82 a3 c2 66 0f 1f 44 00 00 0f 1f 44 00 RSP: 0018:ffffac428127f5b0 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff8d1e166eee80 RCX: 0000000000000000 RDX: ffffac428127f668 RSI: ffff8d1e166eee80 RDI: ffffac428127f610 RBP: ffffac428127f640 R08: ffffffffc03d94a8 R09: 0000000000000000 R10: ffff8d1e24b02000 R11: ffffac428127f5b0 R12: ffff8d1e1b83d000 R13: ffff8d1e1bea0b08 R14: 0000000000000002 R15: 0000000000000002 FS: 00007fab23ffcd80(0000) GS:ffff8d1e28b40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f151f1711e8 CR3: 00000005997c0000 CR4: 00000000003406e0 Call Trace: ? mutex_lock+0xe/0x30 dc_link_allocate_mst_payload+0x9a/0x210 [amdgpu] ? dm_read_reg_func+0x39/0xb0 [amdgpu] ? core_link_enable_stream+0x656/0x730 [amdgpu] core_link_enable_stream+0x656/0x730 [amdgpu] dce110_apply_ctx_to_hw+0x58e/0x5d0 [amdgpu] ? dcn10_verify_allow_pstate_change_high+0x1d/0x280 [amdgpu] ? dcn10_wait_for_mpcc_disconnect+0x3c/0x130 [amdgpu] dc_commit_state+0x292/0x770 [amdgpu] ? add_timer+0x101/0x1f0 ? ttm_bo_put+0x1a1/0x2f0 [ttm] amdgpu_dm_atomic_commit_tail+0xb59/0x1ff0 [amdgpu] ? amdgpu_move_blit.constprop.0+0xb8/0x1f0 [amdgpu] ? amdgpu_bo_move+0x16d/0x2b0 [amdgpu] ? ttm_bo_handle_move_mem+0x118/0x570 [ttm] ? ttm_bo_validate+0x134/0x150 [ttm] ? dm_plane_helper_prepare_fb+0x1b9/0x2a0 [amdgpu] ? _cond_resched+0x15/0x30 ? wait_for_completion_timeout+0x38/0x160 ? _cond_resched+0x15/0x30 ? wait_for_completion_interruptible+0x33/0x190 commit_tail+0x94/0x130 [drm_kms_helper] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper] drm_atomic_helper_set_config+0x70/0xb0 [drm_kms_helper] drm_mode_setcrtc+0x194/0x6a0 [drm] ? _cond_resched+0x15/0x30 ? mutex_lock+0xe/0x30 ? drm_mode_getcrtc+0x180/0x180 [drm] drm_ioctl_kernel+0xaa/0xf0 [drm] drm_ioctl+0x208/0x390 [drm] ? drm_mode_getcrtc+0x180/0x180 [drm] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] do_vfs_ioctl+0x458/0x6d0 ksys_ioctl+0x5e/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x1b0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fab2121f87b Code: 0f 1e fa 48 8b 05 0d 96 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd 95 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffd045f9068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffd045f90a0 RCX: 00007fab2121f87b RDX: 00007ffd045f90a0 RSI: 00000000c06864a2 RDI: 000000000000000b RBP: 00007ffd045f90a0 R08: 0000000000000000 R09: 000055dbd2985d10 R10: 000055dbd2196280 R11: 0000000000000246 R12: 00000000c06864a2 R13: 000000000000000b R14: 0000000000000000 R15: 000055dbd2196280 ---[ end trace 6ea888c24d2059cd ]--- Note as well, I have only been able to reproduce this on setups with 2 MST displays. Changes since v1: * Don't return false when part 1 or part 2 of updating the payloads fails, we don't want to abort at any step of the process even if things fail Reviewed-by: Mikita Lipski Signed-off-by: Lyude Paul Acked-by: Harry Wentland Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit f4bda8b61e7cba32791d5fbfa6e76e46aefb6d92 Author: Stephen Warren Date: Thu Oct 3 14:50:30 2019 -0600 clk: tegra: Mark fuse clock as critical commit bf83b96f87ae2abb1e535306ea53608e8de5dfbb upstream. For a little over a year, U-Boot on Tegra124 has configured the flow controller to perform automatic RAM re-repair on off->on power transitions of the CPU rail[1]. This is mandatory for correct operation of Tegra124. However, RAM re-repair relies on certain clocks, which the kernel must enable and leave running. The fuse clock is one of those clocks. Mark this clock as critical so that LP1 power mode (system suspend) operates correctly. [1] 3cc7942a4ae5 ARM: tegra: implement RAM repair Reported-by: Jonathan Hunter Cc: stable@vger.kernel.org Signed-off-by: Stephen Warren Signed-off-by: Thierry Reding Signed-off-by: Greg Kroah-Hartman commit 806cabd3117f46b505d7a7fdb33d3ce1db99e77a Author: Peter Zijlstra Date: Mon Feb 3 17:36:49 2020 -0800 mm/mmu_gather: invalidate TLB correctly on batch allocation failure and flush commit 0ed1325967ab5f7a4549a2641c6ebe115f76e228 upstream. Architectures for which we have hardware walkers of Linux page table should flush TLB on mmu gather batch allocation failures and batch flush. Some architectures like POWER supports multiple translation modes (hash and radix) and in the case of POWER only radix translation mode needs the above TLBI. This is because for hash translation mode kernel wants to avoid this extra flush since there are no hardware walkers of linux page table. With radix translation, the hardware also walks linux page table and with that, kernel needs to make sure to TLB invalidate page walk cache before page table pages are freed. More details in commit d86564a2f085 ("mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE") The changes to sparc are to make sure we keep the old behavior since we are now removing HAVE_RCU_TABLE_NO_INVALIDATE. The default value for tlb_needs_table_invalidate is to always force an invalidate and sparc can avoid the table invalidate. Hence we define tlb_needs_table_invalidate to false for sparc architecture. Link: http://lkml.kernel.org/r/20200116064531.483522-3-aneesh.kumar@linux.ibm.com Fixes: a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes") Signed-off-by: Peter Zijlstra (Intel) Acked-by: Michael Ellerman [powerpc] Cc: [4.14+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 091c96151279a95c8166c81ecf14e8e2c8b45d53 Author: Niklas Cassel Date: Mon Oct 14 14:09:20 2019 +0200 arm64: dts: qcom: qcs404-evb: Set vdd_apc regulator in high power mode commit eac8ce86cb90ba96cb4bcbf2549d7a8b6938aa30 upstream. vdd_apc is the regulator that supplies the main CPU cluster. At sudden CPU load changes, we have noticed invalid page faults on addresses with all bits shifted, as well as on addresses with individual bits flipped. By putting the vdd_apc regulator in high power mode, the voltage drops during sudden load changes will be less severe, and we have not been able to reproduce the invalid page faults with the regulator in this mode. Fixes: 8faea8edbb35 ("arm64: dts: qcom: qcs404-evb: add spmi regulators") Cc: stable@vger.kernel.org Suggested-by: Bjorn Andersson Signed-off-by: Niklas Cassel Reviewed-by: Vinod Koul Link: https://lore.kernel.org/r/20191014120920.12691-1-niklas.cassel@linaro.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit ed53278ee834c319b77bf91fc0f282b51112a44a Author: David Hildenbrand Date: Mon Feb 3 17:33:48 2020 -0800 mm/page_alloc.c: fix uninitialized memmaps on a partially populated last section commit e822969cab48b786b64246aad1a3ba2a774f5d23 upstream. Patch series "mm: fix max_pfn not falling on section boundary", v2. Playing with different memory sizes for a x86-64 guest, I discovered that some memmaps (highest section if max_mem does not fall on the section boundary) are marked as being valid and online, but contain garbage. We have to properly initialize these memmaps. Looking at /proc/kpageflags and friends, I found some more issues, partially related to this. This patch (of 3): If max_pfn is not aligned to a section boundary, we can easily run into BUGs. This can e.g., be triggered on x86-64 under QEMU by specifying a memory size that is not a multiple of 128MB (e.g., 4097MB, but also 4160MB). I was told that on real HW, we can easily have this scenario (esp., one of the main reasons sub-section hotadd of devmem was added). The issue is, that we have a valid memmap (pfn_valid()) for the whole section, and the whole section will be marked "online". pfn_to_online_page() will succeed, but the memmap contains garbage. E.g., doing a "./page-types -r -a 0x144001" when QEMU was started with "-m 4160M" - (see tools/vm/page-types.c): [ 200.476376] BUG: unable to handle page fault for address: fffffffffffffffe [ 200.477500] #PF: supervisor read access in kernel mode [ 200.478334] #PF: error_code(0x0000) - not-present page [ 200.479076] PGD 59614067 P4D 59614067 PUD 59616067 PMD 0 [ 200.479557] Oops: 0000 [#4] SMP NOPTI [ 200.479875] CPU: 0 PID: 603 Comm: page-types Tainted: G D W 5.5.0-rc1-next-20191209 #93 [ 200.480646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4 [ 200.481648] RIP: 0010:stable_page_flags+0x4d/0x410 [ 200.482061] Code: f3 ff 41 89 c0 48 b8 00 00 00 00 01 00 00 00 45 84 c0 0f 85 cd 02 00 00 48 8b 53 08 48 8b 2b 48f [ 200.483644] RSP: 0018:ffffb139401cbe60 EFLAGS: 00010202 [ 200.484091] RAX: fffffffffffffffe RBX: fffffbeec5100040 RCX: 0000000000000000 [ 200.484697] RDX: 0000000000000001 RSI: ffffffff9535c7cd RDI: 0000000000000246 [ 200.485313] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000 [ 200.485917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000144001 [ 200.486523] R13: 00007ffd6ba55f48 R14: 00007ffd6ba55f40 R15: ffffb139401cbf08 [ 200.487130] FS: 00007f68df717580(0000) GS:ffff9ec77fa00000(0000) knlGS:0000000000000000 [ 200.487804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 200.488295] CR2: fffffffffffffffe CR3: 0000000135d48000 CR4: 00000000000006f0 [ 200.488897] Call Trace: [ 200.489115] kpageflags_read+0xe9/0x140 [ 200.489447] proc_reg_read+0x3c/0x60 [ 200.489755] vfs_read+0xc2/0x170 [ 200.490037] ksys_pread64+0x65/0xa0 [ 200.490352] do_syscall_64+0x5c/0xa0 [ 200.490665] entry_SYSCALL_64_after_hwframe+0x49/0xbe But it can be triggered much easier via "cat /proc/kpageflags > /dev/null" after cold/hot plugging a DIMM to such a system: [root@localhost ~]# cat /proc/kpageflags > /dev/null [ 111.517275] BUG: unable to handle page fault for address: fffffffffffffffe [ 111.517907] #PF: supervisor read access in kernel mode [ 111.518333] #PF: error_code(0x0000) - not-present page [ 111.518771] PGD a240e067 P4D a240e067 PUD a2410067 PMD 0 This patch fixes that by at least zero-ing out that memmap (so e.g., page_to_pfn() will not crash). Commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages") tried to fix a similar issue, but forgot to consider this special case. After this patch, there are still problems to solve. E.g., not all of these pages falling into a memory hole will actually get initialized later and set PageReserved - they are only zeroed out - but at least the immediate crashes are gone. A follow-up patch will take care of this. Link: http://lkml.kernel.org/r/20191211163201.17179-2-david@redhat.com Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap") Signed-off-by: David Hildenbrand Tested-by: Daniel Jordan Cc: Naoya Horiguchi Cc: Pavel Tatashin Cc: Andrew Morton Cc: Steven Sistare Cc: Michal Hocko Cc: Daniel Jordan Cc: Bob Picco Cc: Oscar Salvador Cc: Alexey Dobriyan Cc: Dan Williams Cc: Michal Hocko Cc: Stephen Rothwell Cc: [4.15+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 03c03090c3aae1464bd2c3002540a4b42db4ae31 Author: Gang He Date: Mon Feb 3 17:33:45 2020 -0800 ocfs2: fix oops when writing cloned file commit 2d797e9ff95ecbcf0a83d657928ed20579444857 upstream. Writing a cloned file triggers a kernel oops and the user-space command process is also killed by the system. The bug can be reproduced stably via: 1) create a file under ocfs2 file system directory. journalctl -b > aa.txt 2) create a cloned file for this file. reflink aa.txt bb.txt 3) write the cloned file with dd command. dd if=/dev/zero of=bb.txt bs=512 count=1 conv=notrunc The dd command is killed by the kernel, then you can see the oops message via dmesg command. [ 463.875404] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 463.875413] #PF: supervisor read access in kernel mode [ 463.875416] #PF: error_code(0x0000) - not-present page [ 463.875418] PGD 0 P4D 0 [ 463.875425] Oops: 0000 [#1] SMP PTI [ 463.875431] CPU: 1 PID: 2291 Comm: dd Tainted: G OE 5.3.16-2-default [ 463.875433] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 463.875500] RIP: 0010:ocfs2_refcount_cow+0xa4/0x5d0 [ocfs2] [ 463.875505] Code: 06 89 6c 24 38 89 eb f6 44 24 3c 02 74 be 49 8b 47 28 [ 463.875508] RSP: 0018:ffffa2cb409dfce8 EFLAGS: 00010202 [ 463.875512] RAX: ffff8b1ebdca8000 RBX: 0000000000000001 RCX: ffff8b1eb73a9df0 [ 463.875515] RDX: 0000000000056a01 RSI: 0000000000000000 RDI: 0000000000000000 [ 463.875517] RBP: 0000000000000001 R08: ffff8b1eb73a9de0 R09: 0000000000000000 [ 463.875520] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 463.875522] R13: ffff8b1eb922f048 R14: 0000000000000000 R15: ffff8b1eb922f048 [ 463.875526] FS: 00007f8f44d15540(0000) GS:ffff8b1ebeb00000(0000) knlGS:0000000000000000 [ 463.875529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 463.875532] CR2: 0000000000000028 CR3: 000000003c17a000 CR4: 00000000000006e0 [ 463.875546] Call Trace: [ 463.875596] ? ocfs2_inode_lock_full_nested+0x18b/0x960 [ocfs2] [ 463.875648] ocfs2_file_write_iter+0xaf8/0xc70 [ocfs2] [ 463.875672] new_sync_write+0x12d/0x1d0 [ 463.875688] vfs_write+0xad/0x1a0 [ 463.875697] ksys_write+0xa1/0xe0 [ 463.875710] do_syscall_64+0x60/0x1f0 [ 463.875743] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 463.875758] RIP: 0033:0x7f8f4482ed44 [ 463.875762] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 [ 463.875765] RSP: 002b:00007fff300a79d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 463.875769] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8f4482ed44 [ 463.875771] RDX: 0000000000000200 RSI: 000055f771b5c000 RDI: 0000000000000001 [ 463.875774] RBP: 0000000000000200 R08: 00007f8f44af9c78 R09: 0000000000000003 [ 463.875776] R10: 000000000000089f R11: 0000000000000246 R12: 000055f771b5c000 [ 463.875779] R13: 0000000000000200 R14: 0000000000000000 R15: 000055f771b5c000 This regression problem was introduced by commit e74540b28556 ("ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()"). Link: http://lkml.kernel.org/r/20200121050153.13290-1-ghe@suse.com Fixes: e74540b28556 ("ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()"). Signed-off-by: Gang He Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Jun Piao Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6e41b54999609e910eccc5e900a2e30560d4d9a2 Author: Christian Borntraeger Date: Fri Jan 31 05:02:00 2020 -0500 KVM: s390: do not clobber registers during guest reset/store status commit 55680890ea78be0df5e1384989f1be835043c084 upstream. The initial CPU reset clobbers the userspace fpc and the store status ioctl clobbers the guest acrs + fpr. As these calls are only done via ioctl (and not via vcpu_run), no CPU context is loaded, so we can (and must) act directly on the sync regs, not on the thread context. Cc: stable@kernel.org Fixes: e1788bb995be ("KVM: s390: handle floating point registers in the run ioctl not in vcpu_put/load") Fixes: 31d8b8d41a7e ("KVM: s390: handle access registers in the run ioctl not in vcpu_put/load") Signed-off-by: Christian Borntraeger Reviewed-by: David Hildenbrand Reviewed-by: Cornelia Huck Signed-off-by: Janosch Frank Link: https://lore.kernel.org/r/20200131100205.74720-2-frankja@linux.ibm.com Signed-off-by: Christian Borntraeger Signed-off-by: Greg Kroah-Hartman commit b1f9f9b84374900eaa1a5f65dac3b67a97291b19 Author: Sean Christopherson Date: Fri Jan 17 11:30:51 2020 -0800 KVM: x86: Revert "KVM: X86: Fix fpu state crash in kvm guest" commit 2620fe268e80d667a94553cd37a94ccaa2cb8c83 upstream. Reload the current thread's FPU state, which contains the guest's FPU state, to the CPU registers if necessary during vcpu_enter_guest(). TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, e.g. if I/O is triggered during a KVM call to get_user_pages() or if a softirq occurs while KVM is scheduled in. Moving the handling of TIF_NEED_FPU_LOAD from vcpu_enter_guest() to kvm_arch_vcpu_load(), effectively kvm_sched_in(), papered over a bug where kvm_put_guest_fpu() failed to account for TIF_NEED_FPU_LOAD. The easiest way to the kvm_put_guest_fpu() bug was to run with involuntary preemption enable, thus handling TIF_NEED_FPU_LOAD during kvm_sched_in() made the bug go away. But, removing the handling in vcpu_enter_guest() exposed KVM to the rare case of a softirq triggering kernel_fpu_begin() between vcpu_load() and vcpu_enter_guest(). Now that kvm_{load,put}_guest_fpu() correctly handle TIF_NEED_FPU_LOAD, revert the commit to both restore the vcpu_enter_guest() behavior and eliminate the superfluous switch_fpu_return() in kvm_arch_vcpu_load(). Note, leaving the handling in kvm_arch_vcpu_load() isn't wrong per se, but it is unnecessary, and most critically, makes it extremely difficult to find bugs such as the kvm_put_guest_fpu() issue due to shrinking the window where a softirq can corrupt state. A sample trace triggered by warning if TIF_NEED_FPU_LOAD is set while vcpu state is loaded: gcmaes_crypt_by_sg.constprop.12+0x26e/0x660 ? 0xffffffffc024547d ? __qdisc_run+0x83/0x510 ? __dev_queue_xmit+0x45e/0x990 ? ip_finish_output2+0x1a8/0x570 ? fib4_rule_action+0x61/0x70 ? fib4_rule_action+0x70/0x70 ? fib_rules_lookup+0x13f/0x1c0 ? helper_rfc4106_decrypt+0x82/0xa0 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? crypto_aead_decrypt+0x40/0x70 ? esp_output_tail+0x8f4/0xa5a [esp4] ? skb_ext_add+0xd3/0x170 ? xfrm_input+0x7a6/0x12c0 ? xfrm4_rcv_encap+0xae/0xd0 ? xfrm4_transport_finish+0x200/0x200 ? udp_queue_rcv_one_skb+0x1ba/0x460 ? udp_unicast_rcv_skb.isra.63+0x72/0x90 ? __udp4_lib_rcv+0x51b/0xb00 ? ip_protocol_deliver_rcu+0xd2/0x1c0 ? ip_local_deliver_finish+0x44/0x50 ? ip_local_deliver+0xe0/0xf0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ? ip_rcv+0xbc/0xd0 ? ip_rcv_finish_core.isra.19+0x380/0x380 ? __netif_receive_skb_one_core+0x7e/0x90 ? netif_receive_skb_internal+0x3d/0xb0 ? napi_gro_receive+0xed/0x150 ? 0xffffffffc0243c77 ? net_rx_action+0x149/0x3b0 ? __do_softirq+0xe4/0x2f8 ? handle_irq_event_percpu+0x6a/0x80 ? irq_exit+0xe6/0xf0 ? do_IRQ+0x7f/0xd0 ? common_interrupt+0xf/0xf ? irq_entries_start+0x20/0x660 ? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel] ? kvm_set_msr_common+0xfc7/0x2380 [kvm] ? recalibrate_cpu_khz+0x10/0x10 ? ktime_get+0x3a/0xa0 ? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm] ? kvm_init+0x6bf/0xd00 [kvm] ? __seccomp_filter+0x7a/0x680 ? do_vfs_ioctl+0xa4/0x630 ? security_file_ioctl+0x32/0x50 ? ksys_ioctl+0x60/0x90 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x5f/0x1a0 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace 9564a1ccad733a90 ]--- This reverts commit e751732486eb3f159089a64d1901992b1357e7cc. Fixes: e751732486eb3 ("KVM: X86: Fix fpu state crash in kvm guest") Reported-by: Derek Yerger Reported-by: kernel@najdan.com Cc: Wanpeng Li Cc: Thomas Lambertz Cc: Rik van Riel Cc: Sebastian Andrzej Siewior Cc: Borislav Petkov Cc: Dave Hansen Cc: Thomas Gleixner Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 58e1e7514f06ac71a71d6e1059f4ee7813be3c37 Author: Sean Christopherson Date: Fri Jan 17 11:30:50 2020 -0800 KVM: x86: Ensure guest's FPU state is loaded when accessing for emulation commit a7baead7e312f5a05381d68585fb6dc68e19e90f upstream. Lock the FPU regs and reload the current thread's FPU state, which holds the guest's FPU state, to the CPU registers if necessary prior to accessing guest FPU state as part of emulation. kernel_fpu_begin() can be called from softirq context, therefore KVM must ensure softirqs are disabled (locking the FPU regs disables softirqs) when touching CPU FPU state. Note, for all intents and purposes this reverts commit 6ab0b9feb82a7 ("x86,kvm: remove KVM emulator get_fpu / put_fpu"), but at the time it was applied, removing get/put_fpu() was correct. The re-introduction of {get,put}_fpu() is necessitated by the deferring of FPU state load. Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit a6ff6e0546d0c579d68edf2e009240ed031c6ebc Author: Sean Christopherson Date: Fri Jan 17 11:30:49 2020 -0800 KVM: x86: Handle TIF_NEED_FPU_LOAD in kvm_{load,put}_guest_fpu() commit c9aef3b85f425d1f6635382ec210ee5a7ef55d7d upstream. Handle TIF_NEED_FPU_LOAD similar to how fpu__copy() handles the flag when duplicating FPU state to a new task struct. TIF_NEED_FPU_LOAD can be set any time control is transferred out of KVM, be it voluntarily, e.g. if I/O is triggered during a KVM call to get_user_pages, or involuntarily, e.g. if softirq runs after an IRQ occurs. Therefore, KVM must account for TIF_NEED_FPU_LOAD whenever it is (potentially) accessing CPU FPU state. Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit e3a37628c46d31c9dc84593340acd2b8c361d604 Author: Sean Christopherson Date: Wed Dec 18 13:54:48 2019 -0800 KVM: x86: Free wbinvd_dirty_mask if vCPU creation fails commit 16be9ddea268ad841457a59109963fff8c9de38d upstream. Free the vCPU's wbinvd_dirty_mask if vCPU creation fails after kvm_arch_vcpu_init(), e.g. when installing the vCPU's file descriptor. Do the freeing by calling kvm_arch_vcpu_free() instead of open coding the freeing. This adds a likely superfluous, but ultimately harmless, call to kvmclock_reset(), which only clears vcpu->arch.pv_time_enabled. Using kvm_arch_vcpu_free() allows for additional cleanup in the future. Fixes: f5f48ee15c2ee ("KVM: VMX: Execute WBINVD to keep data consistency with assigned devices") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 9d9933f7f3f4f78f38f588efba686a3167c10a9e Author: Sean Christopherson Date: Tue Dec 10 14:44:13 2019 -0800 KVM: x86: Don't let userspace set host-reserved cr4 bits commit b11306b53b2540c6ba068c4deddb6a17d9f8d95b upstream. Calculate the host-reserved cr4 bits at runtime based on the system's capabilities (using logic similar to __do_cpuid_func()), and use the dynamically generated mask for the reserved bit check in kvm_set_cr4() instead using of the static CR4_RESERVED_BITS define. This prevents userspace from "enabling" features in cr4 that are not supported by the system, e.g. by ignoring KVM_GET_SUPPORTED_CPUID and specifying a bogus CPUID for the vCPU. Allowing userspace to set unsupported bits in cr4 can lead to a variety of undesirable behavior, e.g. failed VM-Enter, and in general increases KVM's attack surface. A crafty userspace can even abuse CR4.LA57 to induce an unchecked #GP on a WRMSR. On a platform without LA57 support: KVM_SET_CPUID2 // CPUID_7_0_ECX.LA57 = 1 KVM_SET_SREGS // CR4.LA57 = 1 KVM_SET_MSRS // KERNEL_GS_BASE = 0x0004000000000000 KVM_RUN leads to a #GP when writing KERNEL_GS_BASE into hardware: unchecked MSR access error: WRMSR to 0xc0000102 (tried to write 0x0004000000000000) at rIP: 0xffffffffa00f239a (vmx_prepare_switch_to_guest+0x10a/0x1d0 [kvm_intel]) Call Trace: kvm_arch_vcpu_ioctl_run+0x671/0x1c70 [kvm] kvm_vcpu_ioctl+0x36b/0x5d0 [kvm] do_vfs_ioctl+0xa1/0x620 ksys_ioctl+0x66/0x70 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4c/0x170 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc08133bf47 Note, the above sequence fails VM-Enter due to invalid guest state. Userspace can allow VM-Enter to succeed (after the WRMSR #GP) by adding a KVM_SET_SREGS w/ CR4.LA57=0 after KVM_SET_MSRS, in which case KVM will technically leak the host's KERNEL_GS_BASE into the guest. But, as KERNEL_GS_BASE is a userspace-defined value/address, the leak is largely benign as a malicious userspace would simply be exposing its own data to the guest, and attacking a benevolent userspace would require multiple bugs in the userspace VMM. Cc: stable@vger.kernel.org Cc: Jun Nakajima Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 715f9f9a07687f3df6dd95f62a09e53ffdb07650 Author: Sean Christopherson Date: Tue Dec 10 15:24:32 2019 -0800 KVM: VMX: Add non-canonical check on writes to RTIT address MSRs commit fe6ed369fca98e99df55c932b85782a5687526b5 upstream. Reject writes to RTIT address MSRs if the data being written is a non-canonical address as the MSRs are subject to canonical checks, e.g. KVM will trigger an unchecked #GP when loading the values to hardware during pt_guest_enter(). Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 2aebc6ed84efeb1e255fd6417f9230388f2206cd Author: Boris Ostrovsky Date: Fri Dec 6 15:36:12 2019 +0000 x86/KVM: Clean up host's steal time structure commit a6bd811f1209fe1c64c9f6fd578101d6436c6b6e upstream. Now that we are mapping kvm_steal_time from the guest directly we don't need keep a copy of it in kvm_vcpu_arch.st. The same is true for the stime field. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky Reviewed-by: Joao Martins Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit f7c1a6c67ff36532f1b0b339e3aae7701a2c0b1e Author: Boris Ostrovsky Date: Thu Dec 5 01:30:51 2019 +0000 x86/kvm: Cache gfn to pfn translation commit 917248144db5d7320655dbb41d3af0b8a0f3d589 upstream. __kvm_map_gfn()'s call to gfn_to_pfn_memslot() is * relatively expensive * in certain cases (such as when done from atomic context) cannot be called Stashing gfn-to-pfn mapping should help with both cases. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky Reviewed-by: Joao Martins Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit d71eef9fcc0b81fd56e59afd305a215d81239894 Author: Boris Ostrovsky Date: Thu Dec 5 03:45:32 2019 +0000 x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed commit b043138246a41064527cf019a3d51d9f015e9796 upstream. There is a potential race in record_steal_time() between setting host-local vcpu->arch.st.steal.preempted to zero (i.e. clearing KVM_VCPU_PREEMPTED) and propagating this value to the guest with kvm_write_guest_cached(). Between those two events the guest may still see KVM_VCPU_PREEMPTED in its copy of kvm_steal_time, set KVM_VCPU_FLUSH_TLB and assume that hypervisor will do the right thing. Which it won't. Instad of copying, we should map kvm_steal_time and that will guarantee atomicity of accesses to @preempted. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky Reviewed-by: Joao Martins Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit a3db2949904b81ae53a840d99f71021f02a01fd3 Author: Boris Ostrovsky Date: Tue Nov 12 16:35:06 2019 +0000 x86/kvm: Introduce kvm_(un)map_gfn() commit 1eff70a9abd46f175defafd29bc17ad456f398a7 upstream. kvm_vcpu_(un)map operates on gfns from any current address space. In certain cases we want to make sure we are not mapping SMRAM and for that we can use kvm_(un)map_gfn() that we are introducing in this patch. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky Reviewed-by: Joao Martins Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 68460ceba319a46ea14b36129bfd0a152e0f00c3 Author: Boris Ostrovsky Date: Wed Oct 30 19:01:31 2019 +0000 x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bit commit 8c6de56a42e0c657955e12b882a81ef07d1d073e upstream. kvm_steal_time_set_preempted() may accidentally clear KVM_VCPU_FLUSH_TLB bit if it is called more than once while VCPU is preempted. This is part of CVE-2019-3016. (This bug was also independently discovered by Jim Mattson ) Signed-off-by: Boris Ostrovsky Reviewed-by: Joao Martins Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit d0671151c2b9b3a3c233f186e3986716bb890a85 Author: John Allen Date: Thu Dec 19 14:17:59 2019 -0600 kvm/svm: PKU not currently supported commit a47970ed74a535b1accb4bc73643fd5a93993c3e upstream. Current SVM implementation does not have support for handling PKU. Guests running on a host with future AMD cpus that support the feature will read garbage from the PKRU register and will hit segmentation faults on boot as memory is getting marked as protected that should not be. Ensure that cpuid from SVM does not advertise the feature. Signed-off-by: John Allen Cc: stable@vger.kernel.org Fixes: 0556cbdc2fbc ("x86/pkeys: Don't check if PKRU is zero before writing it") Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 9213699efd1fdb9cf1c2ab5d6248d1daed5e1294 Author: Sean Christopherson Date: Wed Dec 18 13:54:47 2019 -0800 KVM: PPC: Book3S PR: Free shared page if mmu initialization fails commit cb10bf9194f4d2c5d830eddca861f7ca0fecdbb4 upstream. Explicitly free the shared page if kvmppc_mmu_init() fails during kvmppc_core_vcpu_create(), as the page is freed only in kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit(). Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page") Cc: stable@vger.kernel.org Reviewed-by: Greg Kurz Signed-off-by: Sean Christopherson Acked-by: Paul Mackerras Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit b2301deda8ce921ee7b24604289cd320afeb6ebd Author: Sean Christopherson Date: Wed Dec 18 13:54:46 2019 -0800 KVM: PPC: Book3S HV: Uninit vCPU if vcore creation fails commit 1a978d9d3e72ddfa40ac60d26301b154247ee0bc upstream. Call kvm_vcpu_uninit() if vcore creation fails to avoid leaking any resources allocated by kvm_vcpu_init(), i.e. the vcpu->run page. Fixes: 371fefd6f2dc4 ("KVM: PPC: Allow book3s_hv guests to use SMT processor modes") Cc: stable@vger.kernel.org Reviewed-by: Greg Kurz Signed-off-by: Sean Christopherson Acked-by: Paul Mackerras Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 0718e2d3dc540a667faefdd4572d3de6ac92d258 Author: Sean Christopherson Date: Mon Dec 9 12:19:31 2019 -0800 KVM: x86: Fix potential put_fpu() w/o load_fpu() on MPX platform commit f958bd2314d117f8c29f4821401bc1925bc2e5ef upstream. Unlike most state managed by XSAVE, MPX is initialized to zero on INIT. Because INITs are usually recognized in the context of a VCPU_RUN call, kvm_vcpu_reset() puts the guest's FPU so that the FPU state is resident in memory, zeros the MPX state, and reloads FPU state to hardware. But, in the unlikely event that an INIT is recognized during kvm_arch_vcpu_ioctl_get_mpstate() via kvm_apic_accept_events(), kvm_vcpu_reset() will call kvm_put_guest_fpu() without a preceding kvm_load_guest_fpu() and corrupt the guest's FPU state (and possibly userspace's FPU state as well). Given that MPX is being removed from the kernel[*], fix the bug with the simple-but-ugly approach of loading the guest's FPU during KVM_GET_MP_STATE. [*] See commit f240652b6032b ("x86/mpx: Remove MPX APIs"). Fixes: f775b13eedee2 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 72324a1db6a159cb3c4f9dc3fe8e1b2a88b8269f Author: Marios Pomonis Date: Wed Dec 11 12:47:47 2019 -0800 KVM: x86: Protect MSR-based index computations in fixed_msr_to_seg_unit() from Spectre-v1/L1TF attacks commit 25a5edea71b7c154b6a0b8cec14c711cafa31d26 upstream. This fixes a Spectre-v1/L1TF vulnerability in fixed_msr_to_seg_unit(). This function contains index computations based on the (attacker-controlled) MSR number. Fixes: de9aef5e1ad6 ("KVM: MTRR: introduce fixed_mtrr_segment table") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 2fb35312c070c1d7d5633ce19038c82dea44160b Author: Marios Pomonis Date: Wed Dec 11 12:47:41 2019 -0800 KVM: x86: Protect x86_decode_insn from Spectre-v1/L1TF attacks commit 3c9053a2cae7ba2ba73766a34cea41baa70f57f7 upstream. This fixes a Spectre-v1/L1TF vulnerability in x86_decode_insn(). kvm_emulate_instruction() (an ancestor of x86_decode_insn()) is an exported symbol, so KVM should treat it conservatively from a security perspective. Fixes: 045a282ca415 ("KVM: emulator: implement fninit, fnstsw, fnstcw") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit f2a514318263a23296f759d141a6c38b9db60c29 Author: Marios Pomonis Date: Wed Dec 11 12:47:49 2019 -0800 KVM: x86: Protect MSR-based index computations from Spectre-v1/L1TF attacks in x86.c commit 6ec4c5eee1750d5d17951c4e1960d953376a0dda upstream. This fixes a Spectre-v1/L1TF vulnerability in set_msr_mce() and get_msr_mce(). Both functions contain index computations based on the (attacker-controlled) MSR number. Fixes: 890ca9aefa78 ("KVM: Add MCE support") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit a07fdd5fcb300f611770b640635e35769c89dd77 Author: Marios Pomonis Date: Wed Dec 11 12:47:44 2019 -0800 KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacks commit 8c86405f606ca8508b8d9280680166ca26723695 upstream. This fixes a Spectre-v1/L1TF vulnerability in ioapic_read_indirect(). This function contains index computations based on the (attacker-controlled) IOREGSEL register. Fixes: a2c118bfab8b ("KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit c09be769b48dcb2971a0f74fd9c3e90d86420dac Author: Marios Pomonis Date: Wed Dec 11 12:47:48 2019 -0800 KVM: x86: Protect MSR-based index computations in pmu.h from Spectre-v1/L1TF attacks commit 13c5183a4e643cc2b03a22d0e582c8e17bb7457d upstream. This fixes a Spectre-v1/L1TF vulnerability in the get_gp_pmc() and get_fixed_pmc() functions. They both contain index computations based on the (attacker-controlled) MSR number. Fixes: 25462f7f5295 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 2f8a13754e058f12939b398b1958f73f7273a208 Author: Marios Pomonis Date: Wed Dec 11 12:47:45 2019 -0800 KVM: x86: Protect ioapic_write_indirect() from Spectre-v1/L1TF attacks commit 670564559ca35b439c8d8861fc399451ddf95137 upstream. This fixes a Spectre-v1/L1TF vulnerability in ioapic_write_indirect(). This function contains index computations based on the (attacker-controlled) IOREGSEL register. This patch depends on patch "KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacks". Fixes: 70f93dae32ac ("KVM: Use temporary variable to shorten lines.") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit c8a6b59189424cd82c9fa585831d5e176ae3f0c4 Author: Marios Pomonis Date: Wed Dec 11 12:47:42 2019 -0800 KVM: x86: Protect kvm_hv_msr_[get|set]_crash_data() from Spectre-v1/L1TF attacks commit 8618793750071d66028584a83ed0b4fa7eb4f607 upstream. This fixes Spectre-v1/L1TF vulnerabilities in kvm_hv_msr_get_crash_data() and kvm_hv_msr_set_crash_data(). These functions contain index computations that use the (attacker-controlled) MSR number. Fixes: e7d9513b60e8 ("kvm/x86: added hyper-v crash msrs into kvm hyperv context") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit bf13472e5c05da94aa8b110e3d9efef427bf6622 Author: Marios Pomonis Date: Wed Dec 11 12:47:46 2019 -0800 KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacks commit 4bf79cb089f6b1c6c632492c0271054ce52ad766 upstream. This fixes a Spectre-v1/L1TF vulnerability in kvm_lapic_reg_write(). This function contains index computations based on the (attacker-controlled) MSR number. Fixes: 0105d1a52640 ("KVM: x2apic interface to lapic") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 8b73ccf4b47515ef7a127b9c12279c6c98fa893e Author: Marios Pomonis Date: Wed Dec 11 12:47:52 2019 -0800 KVM: x86: Protect DR-based index computations from Spectre-v1/L1TF attacks commit ea740059ecb37807ba47b84b33d1447435a8d868 upstream. This fixes a Spectre-v1/L1TF vulnerability in __kvm_set_dr() and kvm_get_dr(). Both kvm_get_dr() and kvm_set_dr() (a wrapper of __kvm_set_dr()) are exported symbols so KVM should tream them conservatively from a security perspective. Fixes: 020df0794f57 ("KVM: move DR register access handling into generic code") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit c2b02d093a08d78c1eebac85ff210aab30eb78f3 Author: Marios Pomonis Date: Wed Dec 11 12:47:53 2019 -0800 KVM: x86: Protect pmu_intel.c from Spectre-v1/L1TF attacks commit 66061740f1a487f4ed54fde75e724709f805da53 upstream. This fixes Spectre-v1/L1TF vulnerabilities in intel_find_fixed_event() and intel_rdpmc_ecx_to_pmc(). kvm_rdpmc() (ancestor of intel_find_fixed_event()) and reprogram_fixed_counter() (ancestor of intel_rdpmc_ecx_to_pmc()) are exported symbols so KVM should treat them conservatively from a security perspective. Fixes: 25462f7f5295 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 79777eb82c232488a650b77ca6dd0923a57b3e1c Author: Marios Pomonis Date: Wed Dec 11 12:47:50 2019 -0800 KVM: x86: Refactor prefix decoding to prevent Spectre-v1/L1TF attacks commit 125ffc5e0a56a3eded608dc51e09d5ebf72cf652 upstream. This fixes Spectre-v1/L1TF vulnerabilities in vmx_read_guest_seg_selector(), vmx_read_guest_seg_base(), vmx_read_guest_seg_limit() and vmx_read_guest_seg_ar(). When invoked from emulation, these functions contain index computations based on the (attacker-influenced) segment value. Using constants prevents the attack. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 443fd0049dcf7734b6de2e17d028b67e82be2cf7 Author: Marios Pomonis Date: Wed Dec 11 12:47:43 2019 -0800 KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacks commit 14e32321f3606e4b0970200b6e5e47ee6f1e6410 upstream. This fixes a Spectre-v1/L1TF vulnerability in picdev_write(). It replaces index computations based on the (attacked-controlled) port number with constants through a minor refactoring. Fixes: 85f455f7ddbe ("KVM: Add support for in-kernel PIC emulation") Signed-off-by: Nick Finco Signed-off-by: Marios Pomonis Reviewed-by: Andrew Honig Cc: stable@vger.kernel.org Reviewed-by: Jim Mattson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 8dcbf26833cc0ef66717b84aeebb040ad9be5eea Author: Jens Axboe Date: Mon Feb 3 10:33:42 2020 -0700 aio: prevent potential eventfd recursion on poll commit 01d7a356872eec22ef34a33a5f9cfa917d145468 upstream. If we have nested or circular eventfd wakeups, then we can deadlock if we run them inline from our poll waitqueue wakeup handler. It's also possible to have very long chains of notifications, to the extent where we could risk blowing the stack. Check the eventfd recursion count before calling eventfd_signal(). If it's non-zero, then punt the signaling to async context. This is always safe, as it takes us out-of-line in terms of stack and locking context. Cc: stable@vger.kernel.org # 4.19+ Reviewed-by: Jeff Moyer Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 844d2025b68d9f493129b0a9ec0738ebbfe6e847 Author: Jens Axboe Date: Sun Feb 2 08:23:03 2020 -0700 eventfd: track eventfd_signal() recursion depth commit b5e683d5cab8cd433b06ae178621f083cabd4f63 upstream. eventfd use cases from aio and io_uring can deadlock due to circular or resursive calling, when eventfd_signal() tries to grab the waitqueue lock. On top of that, it's also possible to construct notification chains that are deep enough that we could blow the stack. Add a percpu counter that tracks the percpu recursion depth, warn if we exceed it. The counter is also exposed so that users of eventfd_signal() can do the right thing if it's non-zero in the context where it is called. Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit d5d6b588724128a6ff8cc1267d4aee53965aeaba Author: Coly Li Date: Sat Feb 1 22:42:33 2020 +0800 bcache: add readahead cache policy options via sysfs interface commit 038ba8cc1bffc51250add4a9b9249d4331576d8f upstream. In year 2007 high performance SSD was still expensive, in order to save more space for real workload or meta data, the readahead I/Os for non-meta data was bypassed and not cached on SSD. In now days, SSD price drops a lot and people can find larger size SSD with more comfortable price. It is unncessary to alway bypass normal readahead I/Os to save SSD space for now. This patch adds options for readahead data cache policies via sysfs file /sys/block/bcache/readahead_cache_policy, the options are, - "all": cache all readahead data I/Os. - "meta-only": only cache meta data, and bypass other regular I/Os. If users want to make bcache continue to only cache readahead request for metadata and bypass regular data readahead, please set "meta-only" to this sysfs file. By default, bcache will back to cache all read- ahead requests now. Cc: stable@vger.kernel.org Signed-off-by: Coly Li Acked-by: Eric Wheeler Cc: Michael Lyle Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit f158399c1fe9d4dbcb0adb5a87b1e1426c64b02d Author: Vladis Dronov Date: Wed Jan 8 13:53:47 2020 +0100 watchdog: fix UAF in reboot notifier handling in watchdog core code commit 69503e585192fdd84b240f18a0873d20e18a2e0a upstream. After the commit 44ea39420fc9 ("drivers/watchdog: make use of devm_register_reboot_notifier()") the struct notifier_block reboot_nb in the struct watchdog_device is removed from the reboot notifiers chain at the time watchdog's chardev is closed. But at least in i6300esb.c case reboot_nb is embedded in the struct esb_dev which can be freed on its device removal and before the chardev is closed, thus UAF at reboot: [ 7.728581] esb_probe: esb_dev.watchdog_device ffff91316f91ab28 ts# uname -r note the address ^^^ 5.5.0-rc5-ae6088-wdog ts# ./openwdog0 & [1] 696 ts# opened /dev/watchdog0, sleeping 10s... ts# echo 1 > /sys/devices/pci0000\:00/0000\:00\:09.0/remove [ 178.086079] devres:rel_nodes: dev ffff91317668a0b0 data ffff91316f91ab28 esb_dev.watchdog_device.reboot_nb memory is freed here ^^^ ts# ...woken up [ 181.459010] devres:rel_nodes: dev ffff913171781000 data ffff913174a1dae8 [ 181.460195] devm_unreg_reboot_notifier: res ffff913174a1dae8 nb ffff91316f91ab78 attempt to use memory already freed ^^^ [ 181.461063] devm_unreg_reboot_notifier: nb->call 6b6b6b6b6b6b6b6b [ 181.461243] devm_unreg_reboot_notifier: nb->next 6b6b6b6b6b6b6b6b freed memory is filled with a slub poison ^^^ [1]+ Done ./openwdog0 ts# reboot [ 229.921862] systemd-shutdown[1]: Rebooting. [ 229.939265] notifier_call_chain: nb ffffffff9c6c2f20 nb->next ffffffff9c6d50c0 [ 229.943080] notifier_call_chain: nb ffffffff9c6d50c0 nb->next 6b6b6b6b6b6b6b6b [ 229.946054] notifier_call_chain: nb 6b6b6b6b6b6b6b6b INVAL [ 229.957584] general protection fault: 0000 [#1] SMP [ 229.958770] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.5.0-rc5-ae6088-wdog [ 229.960224] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... [ 229.963288] RIP: 0010:notifier_call_chain+0x66/0xd0 [ 229.969082] RSP: 0018:ffffb20dc0013d88 EFLAGS: 00010246 [ 229.970812] RAX: 000000000000002e RBX: 6b6b6b6b6b6b6b6b RCX: 00000000000008b3 [ 229.972929] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff9ccc46ac [ 229.975028] RBP: 0000000000000001 R08: 0000000000000000 R09: 00000000000008b3 [ 229.977039] R10: 0000000000000001 R11: ffffffff9c26c740 R12: 0000000000000000 [ 229.979155] R13: 6b6b6b6b6b6b6b6b R14: 0000000000000000 R15: 00000000fffffffa ... slub_debug=FZP poison ^^^ [ 229.989089] Call Trace: [ 229.990157] blocking_notifier_call_chain+0x43/0x59 [ 229.991401] kernel_restart_prepare+0x14/0x30 [ 229.992607] kernel_restart+0x9/0x30 [ 229.993800] __do_sys_reboot+0x1d2/0x210 [ 230.000149] do_syscall_64+0x3d/0x130 [ 230.001277] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 230.002639] RIP: 0033:0x7f5461bdd177 [ 230.016402] Modules linked in: i6300esb [ 230.050261] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Fix the crash by reverting 44ea39420fc9 so unregister_reboot_notifier() is called when watchdog device is removed. This also makes handling of the reboot notifier unified with the handling of the restart handler, which is freed with unregister_restart_handler() in the same place. Fixes: 44ea39420fc9 ("drivers/watchdog: make use of devm_register_reboot_notifier()") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Vladis Dronov Reviewed-by: Guenter Roeck Link: https://lore.kernel.org/r/20200108125347.6067-1-vdronov@redhat.com Signed-off-by: Guenter Roeck Signed-off-by: Wim Van Sebroeck Signed-off-by: Greg Kroah-Hartman commit c157da57304e5ea015ea9ed960650d278adf01ea Author: Juergen Gross Date: Fri Jan 17 14:49:31 2020 +0100 xen/balloon: Support xend-based toolstack take two commit eda4eabf86fd6806eaabc23fb90dd056fdac037b upstream. Commit 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack") tried to fix a regression with running on rather ancient Xen versions. Unfortunately the fix was based on the assumption that xend would just use another Xenstore node, but in reality only some downstream versions of xend are doing that. The upstream xend does not write that Xenstore node at all, so the problem must be fixed in another way. The easiest way to achieve that is to fall back to the behavior before commit 96edd61dcf4436 ("xen/balloon: don't online new memory initially") in case the static memory maximum can't be read. This is achieved by setting static_max to the current number of memory pages known by the system resulting in target_diff becoming zero. Fixes: 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack") Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Cc: # 4.13 Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit 726599c637995a99c13207a80577e69eca2f84a6 Author: Gavin Shan Date: Tue Dec 10 15:48:29 2019 +1100 tools/kvm_stat: Fix kvm_exit filter name commit 5fcf3a55a62afb0760ccb6f391d62f20bce4a42f upstream. The filter name is fixed to "exit_reason" for some kvm_exit events, no matter what architect we have. Actually, the filter name ("exit_reason") is only applicable to x86, meaning it's broken on other architects including aarch64. This fixes the issue by providing various kvm_exit filter names, depending on architect we're on. Afterwards, the variable filter name is picked and applied through ioctl(fd, SET_FILTER). Reported-by: Andrew Jones Signed-off-by: Gavin Shan Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 7a19bd6fe21bcc081db364c42056bd0132bef27b Author: Sean Young Date: Thu Nov 21 11:10:47 2019 +0100 media: rc: ensure lirc is initialized before registering input device commit 080d89f522e2baddb4fbbd1af4b67b5f92537ef8 upstream. Once rc_open is called on the input device, lirc events can be delivered. Ensure lirc is ready to do so else we might get this: Registered IR keymap rc-hauppauge rc rc0: Hauppauge WinTV PVR-350 as /devices/pci0000:00/0000:00:1e.0/0000:04:00.0/i2c-0/0-0018/rc/rc0 input: Hauppauge WinTV PVR-350 as /devices/pci0000:00/0000:00:1e.0/0000:04:00.0/i2c-0/0-0018/rc/rc0/input9 BUG: kernel NULL pointer dereference, address: 0000000000000038 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.3.11-300.fc31.x86_64 #1 Hardware name: /DG43NB, BIOS NBG4310H.86A.0096.2009.0903.1845 09/03/2009 Workqueue: events ir_work [ir_kbd_i2c] RIP: 0010:ir_lirc_scancode_event+0x3d/0xb0 Code: a6 b4 07 00 00 49 81 c6 b8 07 00 00 55 53 e8 ba a7 9d ff 4c 89 e7 49 89 45 00 e8 5e 7a 25 00 49 8b 1e 48 89 c5 4c 39 f3 74 58 <8b> 43 38 8b 53 40 89 c1 2b 4b 3c 39 ca 72 41 21 d0 49 8b 7d 00 49 RSP: 0018:ffffaae2000b3d88 EFLAGS: 00010017 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000019 RDX: 0000000000000001 RSI: 006e801b1f26ce6a RDI: ffff9e39797c37b4 RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff9e39797c37b4 R13: ffffaae2000b3db8 R14: ffff9e39797c37b8 R15: ffff9e39797c33d8 FS: 0000000000000000(0000) GS:ffff9e397b680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000035844000 CR4: 00000000000006e0 Call Trace: ir_do_keydown+0x8e/0x2b0 rc_keydown+0x52/0xc0 ir_work+0xb8/0x130 [ir_kbd_i2c] process_one_work+0x19d/0x340 worker_thread+0x50/0x3b0 kthread+0xfb/0x130 ? process_one_work+0x340/0x340 ? kthread_park+0x80/0x80 ret_from_fork+0x35/0x40 Modules linked in: rc_hauppauge tuner msp3400 saa7127 saa7115 ivtv(+) tveeprom cx2341x v4l2_common videodev mc i2c_algo_bit ir_kbd_i2c ip_tables firewire_ohci e1000e serio_raw firewire_core ata_generic crc_itu_t pata_acpi pata_jmicron fuse CR2: 0000000000000038 ---[ end trace c67c2697a99fa74b ]--- RIP: 0010:ir_lirc_scancode_event+0x3d/0xb0 Code: a6 b4 07 00 00 49 81 c6 b8 07 00 00 55 53 e8 ba a7 9d ff 4c 89 e7 49 89 45 00 e8 5e 7a 25 00 49 8b 1e 48 89 c5 4c 39 f3 74 58 <8b> 43 38 8b 53 40 89 c1 2b 4b 3c 39 ca 72 41 21 d0 49 8b 7d 00 49 RSP: 0018:ffffaae2000b3d88 EFLAGS: 00010017 RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000019 RDX: 0000000000000001 RSI: 006e801b1f26ce6a RDI: ffff9e39797c37b4 RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: ffff9e39797c37b4 R13: ffffaae2000b3db8 R14: ffff9e39797c37b8 R15: ffff9e39797c33d8 FS: 0000000000000000(0000) GS:ffff9e397b680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000035844000 CR4: 00000000000006e0 rc rc0: lirc_dev: driver ir_kbd_i2c registered at minor = 0, scancode receiver, no transmitter tuner-simple 0-0061: creating new instance tuner-simple 0-0061: type set to 2 (Philips NTSC (FI1236,FM1236 and compatibles)) ivtv0: Registered device video0 for encoder MPG (4096 kB) ivtv0: Registered device video32 for encoder YUV (2048 kB) ivtv0: Registered device vbi0 for encoder VBI (1024 kB) ivtv0: Registered device video24 for encoder PCM (320 kB) ivtv0: Registered device radio0 for encoder radio ivtv0: Registered device video16 for decoder MPG (1024 kB) ivtv0: Registered device vbi8 for decoder VBI (64 kB) ivtv0: Registered device vbi16 for decoder VOUT ivtv0: Registered device video48 for decoder YUV (1024 kB) Cc: stable@vger.kernel.org Tested-by: Nick French Reported-by: Nick French Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit fd52d88c1dd8476525286f731320fe8aca0b3e80 Author: Johan Hovold Date: Fri Jan 3 17:35:13 2020 +0100 media: iguanair: fix endpoint sanity check commit 1b257870a78b0a9ce98fdfb052c58542022ffb5b upstream. Make sure to use the current alternate setting, which need not be the first one by index, when verifying the endpoint descriptors and initialising the URBs. Failing to do so could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: 26ff63137c45 ("[media] Add support for the IguanaWorks USB IR Transceiver") Fixes: ab1cbdf159be ("media: iguanair: add sanity checks") Cc: stable # 3.6 Cc: Oliver Neukum Signed-off-by: Johan Hovold Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit ae116f21b24a9fc09e86a2f82a99ca08c19a5bf5 Author: Ville Syrjälä Date: Fri Nov 22 19:56:20 2019 +0200 drm/rect: Avoid division by zero commit 433480c1afd44f3e1e664b85063d98cefeefa0ed upstream. Check for zero width/height destination rectangle in drm_rect_clip_scaled() to avoid a division by zero. Cc: stable@vger.kernel.org Fixes: f96bdf564f3e ("drm/rect: Handle rounding errors in drm_rect_clip_scaled, v3.") Cc: Maarten Lankhorst Cc: Benjamin Gaignard Cc: Daniel Vetter Testcase: igt/kms_selftest/drm_rect_clip_scaled_div_by_zero Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20191122175623.13565-2-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter Reviewed-by: Benjamin Gaignard Signed-off-by: Greg Kroah-Hartman commit 5b442859ab803dc48dcfb0b1d9bd6be5d8d56941 Author: Peter Rosin Date: Wed Dec 18 14:28:28 2019 +0200 drm: atmel-hlcdc: prefer a lower pixel-clock than requested commit 51a19d150b520f6cb42143f3bdffacd3c33d7ac5 upstream. The intention was to only select a higher pixel-clock rate than the requested, if a slight overclocking would result in a rate significantly closer to the requested rate than if the conservative lower pixel-clock rate is selected. The fixed patch has the logic the other way around and actually prefers the higher frequency. Fix that. Signed-off-by: Peter Rosin Signed-off-by: Claudiu Beznea Signed-off-by: Sam Ravnborg Fixes: 9946a3a9dbed ("drm/atmel-hlcdc: allow selecting a higher pixel-clock than requested") Reported-by: Claudiu Beznea Tested-by: Claudiu Beznea Cc: Boris Brezillon Cc: # v4.20+ Link: https://patchwork.freedesktop.org/patch/msgid/1576672109-22707-6-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Greg Kroah-Hartman commit d065ae83dff83cd293b7121b5aaa6c24d7acd6ec Author: Claudiu Beznea Date: Wed Dec 18 14:28:25 2019 +0200 drm: atmel-hlcdc: enable clock before configuring timing engine commit 2c1fb9d86f6820abbfaa38a6836157c76ccb4e7b upstream. Changing pixel clock source without having this clock source enabled will block the timing engine and the next operations after (in this case setting ATMEL_HLCDC_CFG(5) settings in atmel_hlcdc_crtc_mode_set_nofb() will fail). It is recomended (although in datasheet this is not present) to actually enabled pixel clock source before doing any changes on timing enginge (only SAM9X60 datasheet specifies that the peripheral clock and pixel clock must be enabled before using LCD controller). Fixes: 1a396789f65a ("drm: add Atmel HLCDC Display Controller support") Signed-off-by: Claudiu Beznea Signed-off-by: Sam Ravnborg Cc: Boris Brezillon Cc: # v4.0+ Link: https://patchwork.freedesktop.org/patch/msgid/1576672109-22707-3-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Greg Kroah-Hartman commit 6421785d17e1996ff421c569fefe3273c0e09013 Author: Claudiu Beznea Date: Wed Dec 18 14:28:24 2019 +0200 drm: atmel-hlcdc: use double rate for pixel clock only if supported commit 07acf4bafe81dd37eff3fbcfbbdbc48084bc202b upstream. Doubled system clock should be used as pixel cock source only if this is supported. This is emphasized by the value of atmel_hlcdc_crtc::dc::desc::fixed_clksrc. Fixes: a6eca2abdd42 ("drm: atmel-hlcdc: add config option for clock selection") Signed-off-by: Claudiu Beznea Signed-off-by: Sam Ravnborg Cc: Boris Brezillon Cc: # v5.3+ Link: https://patchwork.freedesktop.org/patch/msgid/1576672109-22707-2-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Greg Kroah-Hartman commit ae35ac3c4b08115e69d9d11f84e2c4a1f4d1a65b Author: Andreas Gruenbacher Date: Tue Jan 14 17:12:18 2020 +0100 gfs2: fix O_SYNC write handling commit 6e5e41e2dc4e4413296d5a4af54ac92d7cd52317 upstream. In gfs2_file_write_iter, for direct writes, the error checking in the buffered write fallback case is incomplete. This can cause inode write errors to go undetected. Fix and clean up gfs2_file_write_iter along the way. Based on a proposed fix by Christoph Hellwig . Fixes: 967bcc91b044 ("gfs2: iomap direct I/O support") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit 637348690865fedfb10701fd5bacceb9822330a6 Author: Christoph Hellwig Date: Wed Jan 15 16:38:29 2020 +0100 gfs2: move setting current->backing_dev_info commit 4c0e8dda608a51855225c611b5c6b442f95fbc56 upstream. Set current->backing_dev_info just around the buffered write calls to prepare for the next fix. Fixes: 967bcc91b044 ("gfs2: iomap direct I/O support") Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Christoph Hellwig Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit c61b93fae603ef212e95c6071ca4db9afe3ab782 Author: Abhi Das Date: Tue Feb 4 14:14:56 2020 -0600 gfs2: fix gfs2_find_jhead that returns uninitialized jhead with seq 0 commit 7582026f6f3588ecebd281965c8a71aff6fb6158 upstream. When the first log header in a journal happens to have a sequence number of 0, a bug in gfs2_find_jhead() causes it to prematurely exit, and return an uninitialized jhead with seq 0. This can cause failures in the caller. For instance, a mount fails in one test case. The correct behavior is for it to continue searching through the journal to find the correct journal head with the highest sequence number. Fixes: f4686c26ecc3 ("gfs2: read journal in large chunks") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Abhi Das Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit 65afa6958134ae578f2f5867fe5c51952a09f0ff Author: Roberto Bergantinos Corpas Date: Tue Feb 4 11:32:56 2020 +0100 sunrpc: expiry_time should be seconds not timeval commit 3d96208c30f84d6edf9ab4fac813306ac0d20c10 upstream. When upcalling gssproxy, cache_head.expiry_time is set as a timeval, not seconds since boot. As such, RPC cache expiry logic will not clean expired objects created under auth.rpcsec.context cache. This has proven to cause kernel memory leaks on field. Using 64 bit variants of getboottime/timespec Expiration times have worked this way since 2010's c5b29f885afe "sunrpc: use seconds since boot in expiry cache". The gssproxy code introduced in 2012 added gss_proxy_save_rsc and introduced the bug. That's a while for this to lurk, but it required a bit of an extreme case to make it obvious. Signed-off-by: Roberto Bergantinos Corpas Cc: stable@vger.kernel.org Fixes: 030d794bf498 "SUNRPC: Use gssproxy upcall for server..." Tested-By: Frank Sorenson Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 301763168c5f3e8b17ec56c7f98232f362f230f4 Author: Brian Norris Date: Mon Jan 6 14:42:12 2020 -0800 mwifiex: fix unbalanced locking in mwifiex_process_country_ie() commit 65b1aae0d9d5962faccc06bdb8e91a2a0b09451c upstream. We called rcu_read_lock(), so we need to call rcu_read_unlock() before we return. Fixes: 3d94a4a8373b ("mwifiex: fix possible heap overflow in mwifiex_process_country_ie()") Cc: stable@vger.kernel.org Cc: huangwen Cc: Ganapathi Bhat Signed-off-by: Brian Norris Acked-by: Ganapathi Bhat Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit 535a755d6e6d272c680a5627fcbdc0e38a398324 Author: Luca Coelho Date: Fri Jan 31 15:45:25 2020 +0200 iwlwifi: don't throw error when trying to remove IGTK commit 197288d5ba8a5289f22d3aeb4fca3824bfd9b4af upstream. The IGTK keys are only removed by mac80211 after it has already removed the AP station. This causes the driver to throw an error because mac80211 is trying to remove the IGTK when the station doesn't exist anymore. The firmware is aware that the station has been removed and can deal with it the next time we try to add an IGTK for a station, so we shouldn't try to remove the key if the station ID is IWL_MVM_INVALID_STA. Do this by removing the check for mvm_sta before calling iwl_mvm_send_sta_igtk() and check return from that function gracefully if the station ID is invalid. Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Luca Coelho Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit cbe53807a14d9100da75e4056cf125a8119aeaf3 Author: Stephen Warren Date: Thu Oct 3 14:50:31 2019 -0600 ARM: tegra: Enable PLLP bypass during Tegra124 LP1 commit 1a3388d506bf5b45bb283e6a4c4706cfb4897333 upstream. For a little over a year, U-Boot has configured the flow controller to perform automatic RAM re-repair on off->on power transitions of the CPU rail[1]. This is mandatory for correct operation of Tegra124. However, RAM re-repair relies on certain clocks, which the kernel must enable and leave running. PLLP is one of those clocks. This clock is shut down during LP1 in order to save power. Enable bypass (which I believe routes osc_div_clk, essentially the crystal clock, to the PLL output) so that this clock signal toggles even though the PLL is not active. This is required so that LP1 power mode (system suspend) operates correctly. The bypass configuration must then be undone when resuming from LP1, so that all peripheral clocks run at the expected rate. Without this, many peripherals won't work correctly; for example, the UART baud rate would be incorrect. NVIDIA's downstream kernel code only does this if not compiled for Tegra30, so the added code is made conditional upon the chip ID. NVIDIA's downstream code makes this change conditional upon the active CPU cluster. The upstream kernel currently doesn't support cluster switching, so this patch doesn't test the active CPU cluster ID. [1] 3cc7942a4ae5 ARM: tegra: implement RAM repair Reported-by: Jonathan Hunter Cc: stable@vger.kernel.org Signed-off-by: Stephen Warren Signed-off-by: Thierry Reding Signed-off-by: Greg Kroah-Hartman commit 9e78c0e7426192400b482820a82b5f1b4b65b206 Author: Nikolay Borisov Date: Mon Jan 27 11:59:26 2020 +0200 btrfs: Correctly handle empty trees in find_first_clear_extent_bit commit 5750c37523a2c8cbb450b9ef31e21c2ba876b05e upstream. Raviu reported that running his regular fs_trim segfaulted with the following backtrace: [ 237.525947] assertion failed: prev, in ../fs/btrfs/extent_io.c:1595 [ 237.525984] ------------[ cut here ]------------ [ 237.525985] kernel BUG at ../fs/btrfs/ctree.h:3117! [ 237.525992] invalid opcode: 0000 [#1] SMP PTI [ 237.525998] CPU: 4 PID: 4423 Comm: fstrim Tainted: G U OE 5.4.14-8-vanilla #1 [ 237.526001] Hardware name: ASUSTeK COMPUTER INC. [ 237.526044] RIP: 0010:assfail.constprop.58+0x18/0x1a [btrfs] [ 237.526079] Call Trace: [ 237.526120] find_first_clear_extent_bit+0x13d/0x150 [btrfs] [ 237.526148] btrfs_trim_fs+0x211/0x3f0 [btrfs] [ 237.526184] btrfs_ioctl_fitrim+0x103/0x170 [btrfs] [ 237.526219] btrfs_ioctl+0x129a/0x2ed0 [btrfs] [ 237.526227] ? filemap_map_pages+0x190/0x3d0 [ 237.526232] ? do_filp_open+0xaf/0x110 [ 237.526238] ? _copy_to_user+0x22/0x30 [ 237.526242] ? cp_new_stat+0x150/0x180 [ 237.526247] ? do_vfs_ioctl+0xa4/0x640 [ 237.526278] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [ 237.526283] do_vfs_ioctl+0xa4/0x640 [ 237.526288] ? __do_sys_newfstat+0x3c/0x60 [ 237.526292] ksys_ioctl+0x70/0x80 [ 237.526297] __x64_sys_ioctl+0x16/0x20 [ 237.526303] do_syscall_64+0x5a/0x1c0 [ 237.526310] entry_SYSCALL_64_after_hwframe+0x49/0xbe That was due to btrfs_fs_device::aloc_tree being empty. Initially I thought this wasn't possible and as a percaution have put the assert in find_first_clear_extent_bit. Turns out this is indeed possible and could happen when a file system with SINGLE data/metadata profile has a 2nd device added. Until balance is run or a new chunk is allocated on this device it will be completely empty. In this case find_first_clear_extent_bit should return the full range [0, -1ULL] and let the caller handle this i.e for trim the end will be capped at the size of actual device. Link: https://lore.kernel.org/linux-btrfs/izW2WNyvy1dEDweBICizKnd2KDwDiDyY2EYQr4YCwk7pkuIpthx-JRn65MPBde00ND6V0_Lh8mW0kZwzDiLDv25pUYWxkskWNJnVP0kgdMA=@protonmail.com/ Fixes: 45bfcfc168f8 ("btrfs: Implement find_first_clear_extent_bit") CC: stable@vger.kernel.org # 5.2+ Signed-off-by: Nikolay Borisov Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit d82ff2d6406d2e335c246525f71db5ee8a845c40 Author: Josef Bacik Date: Thu Jan 23 15:33:02 2020 -0500 btrfs: flush write bio if we loop in extent_write_cache_pages commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f upstream. There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02e1423 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 5e7a9ad78d5684e6595dda23321764b561e130e3 Author: Filipe Manana Date: Wed Jan 22 12:23:20 2020 +0000 Btrfs: fix race between adding and putting tree mod seq elements and nodes commit 7227ff4de55d931bbdc156c8ef0ce4f100c78a5b upstream. There is a race between adding and removing elements to the tree mod log list and rbtree that can lead to use-after-free problems. Consider the following example that explains how/why the problems happens: 1) Task A has mod log element with sequence number 200. It currently is the only element in the mod log list; 2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to access the tree mod log. When it enters the function, it initializes 'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock' before checking if there are other elements in the mod seq list. Since the list it empty, 'min_seq' remains set to (u64)-1. Then it unlocks the lock 'tree_mod_seq_lock'; 3) Before task A acquires the lock 'tree_mod_log_lock', task B adds itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a sequence number of 201; 4) Some other task, name it task C, modifies a btree and because there elements in the mod seq list, it adds a tree mod elem to the tree mod log rbtree. That node added to the mod log rbtree is assigned a sequence number of 202; 5) Task B, which is doing fiemap and resolving indirect back references, calls btrfs get_old_root(), with 'time_seq' == 201, which in turn calls tree_mod_log_search() - the search returns the mod log node from the rbtree with sequence number 202, created by task C; 6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating the mod log rbtree and finds the node with sequence number 202. Since 202 is less than the previously computed 'min_seq', (u64)-1, it removes the node and frees it; 7) Task B still has a pointer to the node with sequence number 202, and it dereferences the pointer itself and through the call to __tree_mod_log_rewind(), resulting in a use-after-free problem. This issue can be triggered sporadically with the test case generic/561 from fstests, and it happens more frequently with a higher number of duperemove processes. When it happens to me, it either freezes the VM or it produces a trace like the following before crashing: [ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1 [ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 1245.321287] RIP: 0010:rb_next+0x16/0x50 [ 1245.321307] Code: .... [ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202 [ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b [ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80 [ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000 [ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038 [ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8 [ 1245.321539] FS: 00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000 [ 1245.321591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0 [ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1245.321706] Call Trace: [ 1245.321798] __tree_mod_log_rewind+0xbf/0x280 [btrfs] [ 1245.321841] btrfs_search_old_slot+0x105/0xd00 [btrfs] [ 1245.321877] resolve_indirect_refs+0x1eb/0xc60 [btrfs] [ 1245.321912] find_parent_nodes+0x3dc/0x11b0 [btrfs] [ 1245.321947] btrfs_check_shared+0x115/0x1c0 [btrfs] [ 1245.321980] ? extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322029] extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322066] do_vfs_ioctl+0x45a/0x750 [ 1245.322081] ksys_ioctl+0x70/0x80 [ 1245.322092] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1245.322113] __x64_sys_ioctl+0x16/0x20 [ 1245.322126] do_syscall_64+0x5c/0x280 [ 1245.322139] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1245.322155] RIP: 0033:0x7fdee3942dd7 [ 1245.322177] Code: .... [ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7 [ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004 [ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44 [ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48 [ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50 [ 1245.322423] Modules linked in: .... [ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]--- Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum sequence number and iterates the rbtree while holding the lock 'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock' lock, since it is now redundant. Fixes: bd989ba359f2ac ("Btrfs: add tree modification log functions") Fixes: 097b8a7c9e48e2 ("Btrfs: join tree mod log code with the code holding back delayed refs") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit ce066845584ae8e18d1d07920f7922fc4dc2610a Author: Josef Bacik Date: Fri Jan 17 09:12:45 2020 -0500 btrfs: drop log root for dropped roots commit 889bfa39086e86b52fcfaa04d72c95eaeb12f9a5 upstream. If we fsync on a subvolume and create a log root for that volume, and then later delete that subvolume we'll never clean up its log root. Fix this by making switch_commit_roots free the log for any dropped roots we encounter. The extra churn is because we need a btrfs_trans_handle, not the btrfs_transaction. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 7baf8f665b7750449d2a7fff62dfbd206007c62d Author: Josef Bacik Date: Fri Jan 17 08:57:51 2020 -0500 btrfs: set trans->drity in btrfs_commit_transaction commit d62b23c94952e78211a383b7d90ef0afbd9a3717 upstream. If we abort a transaction we have the following sequence if (!trans->dirty && list_empty(&trans->new_bgs)) return; WRITE_ONCE(trans->transaction->aborted, err); The idea being if we didn't modify anything with our trans handle then we don't really need to abort the whole transaction, maybe the other trans handles are fine and we can carry on. However in the case of create_snapshot we add a pending_snapshot object to our transaction and then commit the transaction. We don't actually modify anything. sync() behaves the same way, attach to an existing transaction and commit it. This means that if we have an IO error in the right places we could abort the committing transaction with our trans->dirty being not set and thus not set transaction->aborted. This is a problem because in the create_snapshot() case we depend on pending->error being set to something, or btrfs_commit_transaction returning an error. If we are not the trans handle that gets to commit the transaction, and we're waiting on the commit to happen we get our return value from cur_trans->aborted. If this was not set to anything because sync() hit an error in the transaction commit before it could modify anything then cur_trans->aborted would be 0. Thus we'd return 0 from btrfs_commit_transaction() in create_snapshot. This is a problem because we then try to do things with pending_snapshot->snap, which will be NULL because we didn't create the snapshot, and then we'll get a NULL pointer dereference like the following "BUG: kernel NULL pointer dereference, address: 00000000000001f0" RIP: 0010:btrfs_orphan_cleanup+0x2d/0x330 Call Trace: ? btrfs_mksubvol.isra.31+0x3f2/0x510 btrfs_mksubvol.isra.31+0x4bc/0x510 ? __sb_start_write+0xfa/0x200 ? mnt_want_write_file+0x24/0x50 btrfs_ioctl_snap_create_transid+0x16c/0x1a0 btrfs_ioctl_snap_create_v2+0x11e/0x1a0 btrfs_ioctl+0x1534/0x2c10 ? free_debug_processing+0x262/0x2a3 do_vfs_ioctl+0xa6/0x6b0 ? do_sys_open+0x188/0x220 ? syscall_trace_enter+0x1f8/0x330 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4a/0x1b0 In order to fix this we need to make sure anybody who calls commit_transaction has trans->dirty set so that they properly set the trans->transaction->aborted value properly so any waiters know bad things happened. This was found while I was running generic/475 with my modified fsstress, it reproduced within a few runs. I ran with this patch all night and didn't see the problem again. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 78748f249199b440fd806b341aeb2a46f14c5fc8 Author: Filipe Manana Date: Wed Jan 15 13:21:35 2020 +0000 Btrfs: fix infinite loop during fsync after rename operations commit b5e4ff9d465da1233a2d9a47ebce487c70d8f4ab upstream. Recently fsstress (from fstests) sporadically started to trigger an infinite loop during fsync operations. This turned out to be because support for the rename exchange and whiteout operations was added to fsstress in fstests. These operations, unlike any others in fsstress, cause file names to be reused, whence triggering this issue. However it's not necessary to use rename exchange and rename whiteout operations trigger this issue, simple rename operations and file creations are enough to trigger the issue. The issue boils down to when we are logging inodes that conflict (that had the name of any inode we need to log during the fsync operation), we keep logging them even if they were already logged before, and after that we check if there's any other inode that conflicts with them and then add it again to the list of inodes to log. Skipping already logged inodes fixes the issue. Consider the following example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/testdir # inode 257 $ touch /mnt/testdir/zz # inode 258 $ ln /mnt/testdir/zz /mnt/testdir/zz_link $ touch /mnt/testdir/a # inode 259 $ sync # The following 3 renames achieve the same result as a rename exchange # operation ( /mnt/testdir/zz_link to /mnt/testdir/a). $ mv /mnt/testdir/a /mnt/testdir/a/tmp $ mv /mnt/testdir/zz_link /mnt/testdir/a $ mv /mnt/testdir/a/tmp /mnt/testdir/zz_link # The following rename and file creation give the same result as a # rename whiteout operation ( zz to a2). $ mv /mnt/testdir/zz /mnt/testdir/a2 $ touch /mnt/testdir/zz # inode 260 $ xfs_io -c fsync /mnt/testdir/zz --> results in the infinite loop The following steps happen: 1) When logging inode 260, we find that its reference named "zz" was used by inode 258 in the previous transaction (through the commit root), so inode 258 is added to the list of conflicting indoes that need to be logged; 2) After logging inode 258, we find that its reference named "a" was used by inode 259 in the previous transaction, and therefore we add inode 259 to the list of conflicting inodes to be logged; 3) After logging inode 259, we find that its reference named "zz_link" was used by inode 258 in the previous transaction - we add inode 258 to the list of conflicting inodes to log, again - we had already logged it before at step 3. After logging it again, we find again that inode 259 conflicts with him, and we add again 259 to the list, etc - we end up repeating all the previous steps. So fix this by skipping logging of conflicting inodes that were already logged. Fixes: 6b5fc433a7ad67 ("Btrfs: fix fsync after succession of renames of different files") CC: stable@vger.kernel.org # 5.1+ Signed-off-by: Filipe Manana Reviewed-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 79a29dee9024b6fd21cb47577435577f7bf4831e Author: Filipe Manana Date: Mon Dec 16 18:26:56 2019 +0000 Btrfs: make deduplication with range including the last block work commit 831d2fa25ab8e27592b1b0268dae6f2dfaf7cc43 upstream. Since btrfs was migrated to use the generic VFS helpers for clone and deduplication, it stopped allowing for the last block of a file to be deduplicated when the source file size is not sector size aligned (when eof is somewhere in the middle of the last block). There are two reasons for that: 1) The generic code always rounds down, to a multiple of the block size, the range's length for deduplications. This means we end up never deduplicating the last block when the eof is not block size aligned, even for the safe case where the destination range's end offset matches the destination file's size. That rounding down operation is done at generic_remap_check_len(); 2) Because of that, the btrfs specific code does not expect anymore any non-aligned range length's for deduplication and therefore does not work if such nona-aligned length is given. This patch addresses that second part, and it depends on a patch that fixes generic_remap_check_len(), in the VFS, which was submitted ealier and has the following subject: "fs: allow deduplication of eof block into the end of the destination file" These two patches address reports from users that started seeing lower deduplication rates due to the last block never being deduplicated when the file size is not aligned to the filesystem's block size. Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/ CC: stable@vger.kernel.org # 5.1+ Reviewed-by: Josef Bacik Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit ddb36ab79b11ab8c6694c3cb7a5e93211913c5c1 Author: Filipe Manana Date: Tue Nov 19 12:07:33 2019 +0000 Btrfs: fix missing hole after hole punching and fsync when using NO_HOLES commit 0e56315ca147b3e60c7bf240233a301d3c7fb508 upstream. When using the NO_HOLES feature, if we punch a hole into a file and then fsync it, there are cases where a subsequent fsync will miss the fact that a hole was punched, resulting in the holes not existing after replaying the log tree. Essentially these cases all imply that, tree-log.c:copy_items(), is not invoked for the leafs that delimit holes, because nothing changed those leafs in the current transaction. And it's precisely copy_items() where we currenly detect and log holes, which works as long as the holes are between file extent items in the input leaf or between the beginning of input leaf and the previous leaf or between the last item in the leaf and the next leaf. First example where we miss a hole: *) The extent items of the inode span multiple leafs; *) The punched hole covers a range that affects only the extent items of the first leaf; *) The fsync operation is done in full mode (BTRFS_INODE_NEEDS_FULL_SYNC is set in the inode's runtime flags). That results in the hole not existing after replaying the log tree. For example, if the fs/subvolume tree has the following layout for a particular inode: Leaf N, generation 10: [ ... INODE_ITEM INODE_REF EXTENT_ITEM (0 64K) EXTENT_ITEM (64K 128K) ] Leaf N + 1, generation 10: [ EXTENT_ITEM (128K 64K) ... ] If at transaction 11 we punch a hole coverting the range [0, 128K[, we end up dropping the two extent items from leaf N, but we don't touch the other leaf, so we end up in the following state: Leaf N, generation 11: [ ... INODE_ITEM INODE_REF ] Leaf N + 1, generation 10: [ EXTENT_ITEM (128K 64K) ... ] A full fsync after punching the hole will only process leaf N because it was modified in the current transaction, but not leaf N + 1, since it was not modified in the current transaction (generation 10 and not 11). As a result the fsync will not log any holes, because it didn't process any leaf with extent items. Second example where we will miss a hole: *) An inode as its items spanning 5 (or more) leafs; *) A hole is punched and it covers only the extents items of the 3rd leaf. This resulsts in deleting the entire leaf and not touching any of the other leafs. So the only leaf that is modified in the current transaction, when punching the hole, is the first leaf, which contains the inode item. During the full fsync, the only leaf that is passed to copy_items() is that first leaf, and that's not enough for the hole detection code in copy_items() to determine there's a hole between the last file extent item in the 2nd leaf and the first file extent item in the 3rd leaf (which was the 4th leaf before punching the hole). Fix this by scanning all leafs and punch holes as necessary when doing a full fsync (less common than a non-full fsync) when the NO_HOLES feature is enabled. The lack of explicit file extent items to mark holes makes it necessary to scan existing extents to determine if holes exist. A test case for fstests follows soon. Fixes: 16e7549f045d33 ("Btrfs: incompatible format change to remove hole extents") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit f0edd3abee0d09b14ae796fd5f2015ca9c66f149 Author: Eric Biggers Date: Thu Jan 23 20:12:34 2020 -0800 ext4: fix race conditions in ->d_compare() and ->d_hash() commit ec772f01307a2c06ebf6cdd221e6b518a71ddae7 upstream. Since ->d_compare() and ->d_hash() can be called in RCU-walk mode, ->d_parent and ->d_inode can be concurrently modified, and in particular, ->d_inode may be changed to NULL. For ext4_d_hash() this resulted in a reproducible NULL dereference if a lookup is done in a directory being deleted, e.g. with: int main() { if (fork()) { for (;;) { mkdir("subdir", 0700); rmdir("subdir"); } } else { for (;;) access("subdir/file", 0); } } ... or by running the 't_encrypted_d_revalidate' program from xfstests. Both repros work in any directory on a filesystem with the encoding feature, even if the directory doesn't actually have the casefold flag. I couldn't reproduce a crash in ext4_d_compare(), but it appears that a similar crash is possible there. Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and falling back to the case sensitive behavior if the inode is NULL. Reported-by: Al Viro Fixes: b886ee3e778e ("ext4: Support case-insensitive file name lookups") Cc: # v5.2+ Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20200124041234.159740-1-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit d44fa04f08642e3375a8858bdf002764a6b81230 Author: Eric Biggers Date: Tue Dec 31 12:11:49 2019 -0600 ext4: fix deadlock allocating crypto bounce page from mempool commit 547c556f4db7c09447ecf5f833ab6aaae0c5ab58 upstream. ext4_writepages() on an encrypted file has to encrypt the data, but it can't modify the pagecache pages in-place, so it encrypts the data into bounce pages and writes those instead. All bounce pages are allocated from a mempool using GFP_NOFS. This is not correct use of a mempool, and it can deadlock. This is because GFP_NOFS includes __GFP_DIRECT_RECLAIM, which enables the "never fail" mode for mempool_alloc() where a failed allocation will fall back to waiting for one of the preallocated elements in the pool. But since this mode is used for all a bio's pages and not just the first, it can deadlock waiting for pages already in the bio to be freed. This deadlock can be reproduced by patching mempool_alloc() to pretend that pool->alloc() always fails (so that it always falls back to the preallocations), and then creating an encrypted file of size > 128 KiB. Fix it by only using GFP_NOFS for the first page in the bio. For subsequent pages just use GFP_NOWAIT, and if any of those fail, just submit the bio and start a new one. This will need to be fixed in f2fs too, but that's less straightforward. Fixes: c9af28fdd449 ("ext4 crypto: don't let data integrity writebacks fail with ENOMEM") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20191231181149.47619-1-ebiggers@kernel.org Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit b19f130269c0e4730bb91f03a8e8bba3561c6a99 Author: Vasily Averin Date: Thu Jan 23 12:05:10 2020 +0300 jbd2_seq_info_next should increase position index commit 1a8e9cf40c9a6a2e40b1e924b13ed303aeea4418 upstream. if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. Script below generates endless output $ q=;while read -r r;do echo "$((++q)) $r";done Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/d13805e5-695e-8ac3-b678-26ca2313629f@virtuozzo.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit 6282102dbcbf26e20670ce932917ddcafe266e71 Author: Trond Myklebust Date: Mon Jan 6 13:18:03 2020 -0500 nfsd: fix filecache lookup commit 28c7d86bb6172ffbb1a1237c6388e77f9fe5f181 upstream. If the lookup keeps finding a nfsd_file with an unhashed open file, then retry once only. Signed-off-by: Trond Myklebust Cc: stable@vger.kernel.org Fixes: 65294c1f2c5e "nfsd: add a new struct file caching facility to nfsd" Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 4544a6912416ca50eabe1efab01f3e25e94636d9 Author: Trond Myklebust Date: Sun Feb 2 17:53:54 2020 -0500 NFS: Directory page cache pages need to be locked when read commit 114de38225d9b300f027e2aec9afbb6e0def154b upstream. When a NFS directory page cache page is removed from the page cache, its contents are freed through a call to nfs_readdir_clear_array(). To prevent the removal of the page cache entry until after we've finished reading it, we must take the page lock. Fixes: 11de3b11e08c ("NFS: Fix a memory leak in nfs_readdir") Cc: stable@vger.kernel.org # v2.6.37+ Signed-off-by: Trond Myklebust Reviewed-by: Benjamin Coddington Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 293cdcd89b6c57540449a55c9baccf71321f3f0d Author: Trond Myklebust Date: Sun Feb 2 17:53:53 2020 -0500 NFS: Fix memory leaks and corruption in readdir commit 4b310319c6a8ce708f1033d57145e2aa027a883c upstream. nfs_readdir_xdr_to_array() must not exit without having initialised the array, so that the page cache deletion routines can safely call nfs_readdir_clear_array(). Furthermore, we should ensure that if we exit nfs_readdir_filler() with an error, we free up any page contents to prevent a leak if we try to fill the page again. Fixes: 11de3b11e08c ("NFS: Fix a memory leak in nfs_readdir") Cc: stable@vger.kernel.org # v2.6.37+ Signed-off-by: Trond Myklebust Reviewed-by: Benjamin Coddington Signed-off-by: Anna Schumaker Signed-off-by: Greg Kroah-Hartman commit 8d313c04b425e990f600a232f19f1557016648c1 Author: Arun Easi Date: Thu Jan 23 20:50:14 2020 -0800 scsi: qla2xxx: Fix unbound NVME response length commit 00fe717ee1ea3c2979db4f94b1533c57aed8dea9 upstream. On certain cases when response length is less than 32, NVME response data is supplied inline in IOCB. This is indicated by some combination of state flags. There was an instance when a high, and incorrect, response length was indicated causing driver to overrun buffers. Fix this by checking and limiting the response payload length. Fixes: 7401bc18d1ee3 ("scsi: qla2xxx: Add FC-NVMe command handling") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200124045014.23554-1-hmadhani@marvell.com Signed-off-by: Arun Easi Signed-off-by: Himanshu Madhani Reviewed-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 246a54895ac6cc117b064c5f65648076b9f2a5fc Author: Michael Ellerman Date: Fri Feb 7 22:15:46 2020 +1100 powerpc/futex: Fix incorrect user access blocking commit 9dc086f1e9ef39dd823bd27954b884b2062f9e70 upstream. The early versions of our kernel user access prevention (KUAP) were written by Russell and Christophe, and didn't have separate read/write access. At some point I picked up the series and added the read/write access, but I failed to update the usages in futex.h to correctly allow read and write. However we didn't notice because of another bug which was causing the low-level code to always enable read and write. That bug was fixed recently in commit 1d8f739b07bd ("powerpc/kuap: Fix set direction in allow/prevent_user_access()"). futex_atomic_cmpxchg_inatomic() is passed the user address as %3 and does: 1: lwarx %1, 0, %3 cmpw 0, %1, %4 bne- 3f 2: stwcx. %5, 0, %3 Which clearly loads and stores from/to %3. The logic in arch_futex_atomic_op_inuser() is similar, so fix both of them to use allow_read_write_user(). Without this fix, and with PPC_KUAP_DEBUG=y, we see eg: Bug: Read fault blocked by AMR! WARNING: CPU: 94 PID: 149215 at arch/powerpc/include/asm/book3s/64/kup-radix.h:126 __do_page_fault+0x600/0xf30 CPU: 94 PID: 149215 Comm: futex_requeue_p Tainted: G W 5.5.0-rc7-gcc9x-g4c25df5640ae #1 ... NIP [c000000000070680] __do_page_fault+0x600/0xf30 LR [c00000000007067c] __do_page_fault+0x5fc/0xf30 Call Trace: [c00020138e5637e0] [c00000000007067c] __do_page_fault+0x5fc/0xf30 (unreliable) [c00020138e5638c0] [c00000000000ada8] handle_page_fault+0x10/0x30 --- interrupt: 301 at cmpxchg_futex_value_locked+0x68/0xd0 LR = futex_lock_pi_atomic+0xe0/0x1f0 [c00020138e563bc0] [c000000000217b50] futex_lock_pi_atomic+0x80/0x1f0 (unreliable) [c00020138e563c30] [c00000000021b668] futex_requeue+0x438/0xb60 [c00020138e563d60] [c00000000021c6cc] do_futex+0x1ec/0x2b0 [c00020138e563d90] [c00000000021c8b8] sys_futex+0x128/0x200 [c00020138e563e20] [c00000000000b7ac] system_call+0x5c/0x68 Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: syzbot+e808452bad7c375cbee6@syzkaller-ppc64.appspotmail.com Signed-off-by: Michael Ellerman Reviewed-by: Christophe Leroy Link: https://lore.kernel.org/r/20200207122145.11928-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman commit eee7a67c0391906f0e4ad51004ae861515c0e10c Author: Chuhong Yuan Date: Tue Dec 10 00:21:44 2019 +0800 crypto: picoxcell - adjust the position of tasklet_init and fix missed tasklet_kill commit 7f8c36fe9be46862c4f3c5302f769378028a34fa upstream. Since tasklet is needed to be initialized before registering IRQ handler, adjust the position of tasklet_init to fix the wrong order. Besides, to fix the missed tasklet_kill, this patch adds a helper function and uses devm_add_action to kill the tasklet automatically. Fixes: ce92136843cb ("crypto: picoxcell - add support for the picoxcell crypto engines") Signed-off-by: Chuhong Yuan Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit e057d64f86d4369074d4e5f8bac6b6766bee3409 Author: Herbert Xu Date: Sat Dec 7 22:15:15 2019 +0800 crypto: api - Fix race condition in crypto_spawn_alg commit 73669cc556462f4e50376538d77ee312142e8a8a upstream. The function crypto_spawn_alg is racy because it drops the lock before shooting the dying algorithm. The algorithm could disappear altogether before we shoot it. This patch fixes it by moving the shooting into the locked section. Fixes: 6bfd48096ff8 ("[CRYPTO] api: Added spawns") Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 12a15e1c544e22eafd83bbce5dcb5de7140a1e3b Author: Tudor Ambarus Date: Thu Dec 5 09:54:01 2019 +0000 crypto: atmel-aes - Fix counter overflow in CTR mode commit 781a08d9740afa73357f1a60d45d7c93d7cca2dd upstream. 32 bit counter is not supported by neither of our AES IPs, all implement a 16 bit block counter. Drop the 32 bit block counter logic. Fixes: fcac83656a3e ("crypto: atmel-aes - fix the counter overflow in CTR mode") Signed-off-by: Tudor Ambarus Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 2c4d8203ff0c4980598f58624522fac7032e340a Author: Herbert Xu Date: Fri Nov 29 16:40:24 2019 +0800 crypto: pcrypt - Do not clear MAY_SLEEP flag in original request commit e8d998264bffade3cfe0536559f712ab9058d654 upstream. We should not be modifying the original request's MAY_SLEEP flag upon completion. It makes no sense to do so anyway. Reported-by: Eric Biggers Fixes: 5068c7a883d1 ("crypto: pcrypt - Add pcrypt crypto...") Signed-off-by: Herbert Xu Tested-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit ded7c73a2b8cf3d33f74e30f17e80ac64e6d2aed Author: Ard Biesheuvel Date: Thu Nov 28 13:55:31 2019 +0100 crypto: arm64/ghash-neon - bump priority to 150 commit 5441c6507bc84166e9227e9370a56c57ba13794a upstream. The SIMD based GHASH implementation for arm64 is typically much faster than the generic one, and doesn't use any lookup tables, so it is clearly preferred when available. So bump the priority to reflect that. Fixes: 5a22b198cd527447 ("crypto: arm64/ghash - register PMULL variants ...") Signed-off-by: Ard Biesheuvel Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 3a35871603a6a757af00a66abf2a7ca2d3199826 Author: Ard Biesheuvel Date: Wed Nov 27 13:01:36 2019 +0100 crypto: ccp - set max RSA modulus size for v3 platform devices as well commit 11548f5a5747813ff84bed6f2ea01100053b0d8d upstream. AMD Seattle incorporates a non-PCI version of the v3 CCP crypto accelerator, and this version was left behind when the maximum RSA modulus size was parameterized in order to support v5 hardware which supports larger moduli than v3 hardware does. Due to this oversight, RSA acceleration no longer works at all on these systems. Fix this by setting the .rsamax property to the appropriate value for v3 platform hardware. Fixes: e28c190db66830c0 ("csrypto: ccp - Expand RSA support for a v5 ccp") Cc: Gary R Hook Signed-off-by: Ard Biesheuvel Acked-by: Gary R Hook Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 58d8f2dec6ab15bfa5ece28fc408afb06de9fb1a Author: Jonathan Cameron Date: Tue Nov 19 13:42:57 2019 +0800 crypto: hisilicon - Use the offset fields in sqe to avoid need to split scatterlists commit 484a897ffa3005f16cd9a31efd747bcf8155826f upstream. We can configure sgl offset fields in ZIP sqe to let ZIP engine read/write sgl data with skipped data. Hence no need to splite the sgl. Fixes: 62c455ca853e (crypto: hisilicon - add HiSilicon ZIP accelerator support) Signed-off-by: Jonathan Cameron Signed-off-by: Zhou Wang Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit a791fc62a5749a8511cfdaf60e89923dc7648531 Author: Herbert Xu Date: Wed Dec 11 10:50:11 2019 +0800 crypto: api - fix unexpectedly getting generic implementation commit 2bbb3375d967155bccc86a5887d4a6e29c56b683 upstream. When CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y, the first lookup of an algorithm that needs to be instantiated using a template will always get the generic implementation, even when an accelerated one is available. This happens because the extra self-tests for the accelerated implementation allocate the generic implementation for comparison purposes, and then crypto_alg_tested() for the generic implementation "fulfills" the original request (i.e. sets crypto_larval::adult). This patch fixes this by only fulfilling the original request if we are currently the best outstanding larval as judged by the priority. If we're not the best then we will ask all waiters on that larval request to retry the lookup. Note that this patch introduces a behaviour change when the module providing the new algorithm is unregistered during the process. Previously we would have failed with ENOENT, after the patch we will instead redo the lookup. Fixes: 9a8a6b3f0950 ("crypto: testmgr - fuzz hashes against...") Fixes: d435e10e67be ("crypto: testmgr - fuzz skciphers against...") Fixes: 40153b10d91c ("crypto: testmgr - fuzz AEADs against...") Reported-by: Eric Biggers Signed-off-by: Herbert Xu Reviewed-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 1f5f3f65f956d12e86f86a248dd908ef55c4e2be Author: Lorenz Bauer Date: Fri Jan 24 11:27:52 2020 +0000 selftests: bpf: Ignore FIN packets for reuseport tests commit 8bec4f665e0baecb5f1b683379fc10b3745eb612 upstream. The reuseport tests currently suffer from a race condition: FIN packets count towards DROP_ERR_SKB_DATA, since they don't contain a valid struct cmd. Tests will spuriously fail depending on whether check_results is called before or after the FIN is processed. Exit the BPF program early if FIN is set. Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: Lorenz Bauer Signed-off-by: Daniel Borkmann Reviewed-by: Jakub Sitnicki Acked-by: Martin KaFai Lau Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200124112754.19664-3-lmb@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit 44a522bf5edc877b34ea07997fbcb61e9045c881 Author: Lorenz Bauer Date: Fri Jan 24 11:27:51 2020 +0000 selftests: bpf: Use a temporary file in test_sockmap commit c31dbb1e41d1857b403f9bf58c87f5898519a0bc upstream. Use a proper temporary file for sendpage tests. This means that running the tests doesn't clutter the working directory, and allows running the test on read-only filesystems. Fixes: 16962b2404ac ("bpf: sockmap, add selftests") Signed-off-by: Lorenz Bauer Signed-off-by: Daniel Borkmann Reviewed-by: Jakub Sitnicki Acked-by: Martin KaFai Lau Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200124112754.19664-2-lmb@cloudflare.com Signed-off-by: Greg Kroah-Hartman commit da43712a7262891317883d4b3a909fb18dac4b1d Author: Hangbin Liu Date: Fri Jan 17 18:06:56 2020 +0800 selftests/bpf: Skip perf hw events test if the setup disabled it commit f1c3656c6d9c147d07d16614455aceb34932bdeb upstream. The same with commit 4e59afbbed96 ("selftests/bpf: skip nmi test when perf hw events are disabled"), it would make more sense to skip the test_stacktrace_build_id_nmi test if the setup (e.g. virtual machines) has disabled hardware perf events. Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context") Signed-off-by: Hangbin Liu Signed-off-by: Daniel Borkmann Acked-by: John Fastabend Link: https://lore.kernel.org/bpf/20200117100656.10359-1-liuhangbin@gmail.com Signed-off-by: Greg Kroah-Hartman commit 49437ecf9f3089ec3891ff5a50a21b0439f6b6fa Author: Alexei Starovoitov Date: Wed Dec 18 18:04:42 2019 -0800 selftests/bpf: Fix test_attach_probe commit 580205dd4fe800b1e95be8b6df9e2991f975a8ad upstream. Fix two issues in test_attach_probe: 1. it was not able to parse /proc/self/maps beyond the first line, since %s means parse string until white space. 2. offset has to be accounted for otherwise uprobed address is incorrect. Fixes: 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests") Signed-off-by: Alexei Starovoitov Signed-off-by: Daniel Borkmann Acked-by: Yonghong Song Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/20191219020442.1922617-1-ast@kernel.org Signed-off-by: Greg Kroah-Hartman commit c0ada6ad3ec5f9c02d04f49ba9fc115732bfa0ae Author: Jesper Dangaard Brouer Date: Fri Dec 20 17:19:36 2019 +0100 samples/bpf: Xdp_redirect_cpu fix missing tracepoint attach commit f9e6bfdbaf0cf304d72c70a05d81acac01a04f48 upstream. When sample xdp_redirect_cpu was converted to use libbpf, the tracepoints used by this sample were not getting attached automatically like with bpf_load.c. The BPF-maps was still getting loaded, thus nobody notice that the tracepoints were not updating these maps. This fix doesn't use the new skeleton code, as this bug was introduced in v5.1 and stable might want to backport this. E.g. Red Hat QA uses this sample as part of their testing. Fixes: bbaf6029c49c ("samples/bpf: Convert XDP samples to libbpf usage") Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Alexei Starovoitov Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/157685877642.26195.2798780195186786841.stgit@firesoul Signed-off-by: Greg Kroah-Hartman commit a69af866bd35272ac2caf62eba5704cd36dbd5d9 Author: Toke Høiland-Jørgensen Date: Mon Jan 20 14:06:41 2020 +0100 samples/bpf: Don't try to remove user's homedir on clean commit b2e5e93ae8af6a34bca536cdc4b453ab1e707b8b upstream. The 'clean' rule in the samples/bpf Makefile tries to remove backup files (ending in ~). However, if no such files exist, it will instead try to remove the user's home directory. While the attempt is mostly harmless, it does lead to a somewhat scary warning like this: rm: cannot remove '~': Is a directory Fix this by using find instead of shell expansion to locate any actual backup files that need to be removed. Fixes: b62a796c109c ("samples/bpf: allow make to be run from samples/bpf/ directory") Signed-off-by: Toke Høiland-Jørgensen Signed-off-by: Alexei Starovoitov Acked-by: Jesper Dangaard Brouer Link: https://lore.kernel.org/bpf/157952560126.1683545.7273054725976032511.stgit@toke.dk Signed-off-by: Greg Kroah-Hartman commit fbee8f61747fd82ad9e2bc95099d203871f53306 Author: Davide Caratti Date: Mon Feb 3 16:29:29 2020 +0100 tc-testing: fix eBPF tests failure on linux fresh clones commit 7145fcfffef1fad4266aaf5ca96727696916edb7 upstream. when the following command is done on a fresh clone of the kernel tree, [root@f31 tc-testing]# ./tdc.py -c bpf test cases that need to build the eBPF sample program fail systematically, because 'buildebpfPlugin' is unable to install the kernel headers (i.e, the 'khdr' target fails). Pass the correct environment to 'make', in place of ENVIR, to allow running these tests. Fixes: 4c2d39bd40c1 ("tc-testing: use a plugin to build eBPF program") Signed-off-by: Davide Caratti Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f7a2ccc00a364ff5b380933101836dfb2304390c Author: Andrii Nakryiko Date: Fri Jan 24 12:18:46 2020 -0800 libbpf: Fix realloc usage in bpf_core_find_cands commit 35b9211c0a2427e8f39e534f442f43804fc8d5ca upstream. Fix bug requesting invalid size of reallocated array when constructing CO-RE relocation candidate list. This can cause problems if there are many potential candidates and a very fine-grained memory allocator bucket sizes are used. Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: William Smith Signed-off-by: Andrii Nakryiko Signed-off-by: Daniel Borkmann Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com Signed-off-by: Greg Kroah-Hartman commit ab48c14a444b9198e91435526b68bfdf2613c2d0 Author: Amol Grover Date: Thu Jan 23 17:34:38 2020 +0530 bpf, devmap: Pass lockdep expression to RCU lists commit 485ec2ea9cf556e9c120e07961b7b459d776a115 upstream. head is traversed using hlist_for_each_entry_rcu outside an RCU read-side critical section but under the protection of dtab->index_lock. Hence, add corresponding lockdep expression to silence false-positive lockdep warnings, and harden RCU lists. Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index") Signed-off-by: Amol Grover Signed-off-by: Daniel Borkmann Acked-by: Jesper Dangaard Brouer Acked-by: Toke Høiland-Jørgensen Link: https://lore.kernel.org/bpf/20200123120437.26506-1-frextrite@gmail.com Signed-off-by: Greg Kroah-Hartman commit 77bb53cb094828a31cd3c5b402899810f63073c1 Author: Andrii Nakryiko Date: Wed Dec 11 17:36:20 2019 -0800 selftests/bpf: Fix perf_buffer test on systems w/ offline CPUs commit 91cbdf740a476cf2c744169bf407de2e3ac1f3cf upstream. Fix up perf_buffer.c selftest to take into account offline/missing CPUs. Fixes: ee5cf82ce04a ("selftests/bpf: test perf buffer API") Signed-off-by: Andrii Nakryiko Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191212013621.1691858-1-andriin@fb.com Signed-off-by: Greg Kroah-Hartman commit 5b0e9b563c01f0f3bb1a00712bc99225fbb44d07 Author: Björn Töpel Date: Mon Dec 16 10:13:35 2019 +0100 riscv, bpf: Fix broken BPF tail calls commit f1003b787c00fbaa4b11619c6b23a885bfce8f07 upstream. The BPF JIT incorrectly clobbered the a0 register, and did not flag usage of s5 register when BPF stack was being used. Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G") Signed-off-by: Björn Töpel Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20191216091343.23260-2-bjorn.topel@gmail.com Signed-off-by: Greg Kroah-Hartman commit f3107a3c9b845db995d6bedee7a60d5309723a51 Author: Nikolay Borisov Date: Fri Jan 10 14:11:34 2020 +0200 btrfs: Handle another split brain scenario with metadata uuid feature commit 05840710149c7d1a78ea85a2db5723f706e97d8f upstream. There is one more cases which isn't handled by the original metadata uuid work. Namely, when a filesystem has METADATA_UUID incompat bit and the user decides to change the FSID to the original one e.g. have metadata_uuid and fsid match. In case of power failure while this operation is in progress we could end up in a situation where some of the disks have the incompat bit removed and the other half have both METADATA_UUID_INCOMPAT and FSID_CHANGING_IN_PROGRESS flags. This patch handles the case where a disk that has successfully changed its FSID such that it equals METADATA_UUID is scanned first. Subsequently when a disk with both METADATA_UUID_INCOMPAT/FSID_CHANGING_IN_PROGRESS flags is scanned find_fsid_changed won't be able to find an appropriate btrfs_fs_devices. This is done by extending find_fsid_changed to correctly find btrfs_fs_devices whose metadata_uuid/fsid are the same and they match the metadata_uuid of the currently scanned device. Fixes: cc5de4e70256 ("btrfs: Handle final split-brain possibility during fsid change") Reviewed-by: Josef Bacik Reported-by: Su Yue Signed-off-by: Nikolay Borisov Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit dd9837259de0734786db8b33d17865095789a0f7 Author: Josef Bacik Date: Fri Jan 3 10:38:44 2020 -0500 btrfs: fix improper setting of scanned for range cyclic write cache pages commit 556755a8a99be8ca3cd9fbe36aaf9b3b0339a00d upstream. We noticed that we were having regular CG OOM kills in cases where there was still enough dirty pages to avoid OOM'ing. It turned out there's this corner case in btrfs's handling of range_cyclic where files that were being redirtied were not getting fully written out because of how we do range_cyclic writeback. We unconditionally were setting scanned = 1; the first time we found any pages in the inode. This isn't actually what we want, we want it to be set if we've scanned the entire file. For range_cyclic we could be starting in the middle or towards the end of the file, so we could write one page and then not write any of the other dirty pages in the file because we set scanned = 1. Fix this by not setting scanned = 1 if we find pages. The rules for setting scanned should be 1) !range_cyclic. In this case we have a specified range to write out. 2) range_cyclic && index == 0. In this case we've started at the beginning and there is no need to loop around a second time. 3) range_cyclic && we started at index > 0 and we've reached the end of the file without satisfying our nr_to_write. This patch fixes both of our writepages implementations to make sure these rules hold true. This fixed our over zealous CG OOMs in production. Fixes: d1310b2e0cd9 ("Btrfs: Split the extent_map code into two parts") Signed-off-by: Josef Bacik Reviewed-by: David Sterba [ add comment ] Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit b4c8ed0bf977760a206997b6429a7ac91978f440 Author: Herbert Xu Date: Tue Nov 26 15:58:45 2019 +0800 crypto: pcrypt - Avoid deadlock by using per-instance padata queues commit bbefa1dd6a6d53537c11624752219e39959d04fb upstream. If the pcrypt template is used multiple times in an algorithm, then a deadlock occurs because all pcrypt instances share the same padata_instance, which completes requests in the order submitted. That is, the inner pcrypt request waits for the outer pcrypt request while the outer request is already waiting for the inner. This patch fixes this by allocating a set of queues for each pcrypt instance instead of using two global queues. In order to maintain the existing user-space interface, the pinst structure remains global so any sysfs modifications will apply to every pcrypt instance. Note that when an update occurs we have to allocate memory for every pcrypt instance. Should one of the allocations fail we will abort the update without rolling back changes already made. The new per-instance data structure is called padata_shell and is essentially a wrapper around parallel_data. Reproducer: #include #include #include int main() { struct sockaddr_alg addr = { .salg_type = "aead", .salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))" }; int algfd, reqfd; char buf[32] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (void *)&addr, sizeof(addr)); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20); reqfd = accept(algfd, 0, 0); write(reqfd, buf, 32); read(reqfd, buf, 16); } Reported-by: syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com Fixes: 5068c7a883d1 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper") Signed-off-by: Herbert Xu Tested-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit c8e9dafe668d9c50e1b1287ac2375497d747f367 Author: Steven Rostedt (VMware) Date: Wed Feb 5 09:20:32 2020 -0500 ftrace: Protect ftrace_graph_hash with ftrace_sync [ Upstream commit 54a16ff6f2e50775145b210bcd94d62c3c2af117 ] As function_graph tracer can run when RCU is not "watching", it can not be protected by synchronize_rcu() it requires running a task on each CPU before it can be freed. Calling schedule_on_each_cpu(ftrace_sync) needs to be used. Link: https://lore.kernel.org/r/20200205131110.GT2935@paulmck-ThinkPad-P72 Cc: stable@vger.kernel.org Fixes: b9b0c831bed26 ("ftrace: Convert graph filter to use hash tables") Reported-by: "Paul E. McKenney" Reviewed-by: Joel Fernandes (Google) Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 6a652ed941aa3e31a645c4074ef2a80b2535f5b9 Author: Steven Rostedt (VMware) Date: Wed Feb 5 02:17:57 2020 -0500 ftrace: Add comment to why rcu_dereference_sched() is open coded [ Upstream commit 16052dd5bdfa16dbe18d8c1d4cde2ddab9d23177 ] Because the function graph tracer can execute in sections where RCU is not "watching", the rcu_dereference_sched() for the has needs to be open coded. This is fine because the RCU "flavor" of the ftrace hash is protected by its own RCU handling (it does its own little synchronization on every CPU and does not rely on RCU sched). Acked-by: Joel Fernandes (Google) Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit c9dc142b39a89e4bb29beaa92740413863fadca2 Author: Amol Grover Date: Wed Feb 5 11:27:02 2020 +0530 tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu [ Upstream commit fd0e6852c407dd9aefc594f54ddcc21d84803d3b ] Fix following instances of sparse error kernel/trace/ftrace.c:5667:29: error: incompatible types in comparison kernel/trace/ftrace.c:5813:21: error: incompatible types in comparison kernel/trace/ftrace.c:5868:36: error: incompatible types in comparison kernel/trace/ftrace.c:5870:25: error: incompatible types in comparison Use rcu_dereference_protected to dereference the newly annotated pointer. Link: http://lkml.kernel.org/r/20200205055701.30195-1-frextrite@gmail.com Signed-off-by: Amol Grover Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 024537c7548fbbb570d80b14da7cba8c2fc48dd2 Author: Amol Grover Date: Sat Feb 1 12:57:04 2020 +0530 tracing: Annotate ftrace_graph_hash pointer with __rcu [ Upstream commit 24a9729f831462b1d9d61dc85ecc91c59037243f ] Fix following instances of sparse error kernel/trace/ftrace.c:5664:29: error: incompatible types in comparison kernel/trace/ftrace.c:5785:21: error: incompatible types in comparison kernel/trace/ftrace.c:5864:36: error: incompatible types in comparison kernel/trace/ftrace.c:5866:25: error: incompatible types in comparison Use rcu_dereference_protected to access the __rcu annotated pointer. Link: http://lkml.kernel.org/r/20200201072703.17330-1-frextrite@gmail.com Reviewed-by: Joel Fernandes (Google) Signed-off-by: Amol Grover Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit df57920d6e1800bfce270f32512e0c45408e0893 Author: Pierre-Louis Bossart Date: Fri Jan 24 15:36:21 2020 -0600 ASoC: SOF: core: release resources on errors in probe_continue [ Upstream commit 410e5e55c9c1c9c0d452ac5b9adb37b933a7747e ] The initial intent of releasing resources in the .remove does not work well with HDaudio codecs. If the probe_continue() fails in a work queue, e.g. due to missing firmware or authentication issues, we don't release any resources, and as a result the kernel oopses during suspend operations. The suggested fix is to release all resources during errors in probe_continue(), and use fw_state to track resource allocation state, so that .remove does not attempt to release the same hardware resources twice. PM operations are also modified so that no action is done if DSP resources have been freed due to an error at probe. Reported-by: Takashi Iwai Co-developed-by: Kai Vehmanen Signed-off-by: Kai Vehmanen Bugzilla: http://bugzilla.suse.com/show_bug.cgi?id=1161246 Signed-off-by: Pierre-Louis Bossart Reviewed-by: Takashi Iwai Link: https://lore.kernel.org/r/20200124213625.30186-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 3145862d8f9eb26c609191e66ab364f90d0dad93 Author: Ranjani Sridharan Date: Tue Dec 17 18:26:09 2019 -0600 ASoC: SOF: Introduce state machine for FW boot [ Upstream commit 6ca5cecbd1c1758666ab79446f19e0e61ed11444 ] Add a state machine for FW boot to track the different stages of FW boot and replace the boot_complete field with fw_state field in struct snd_sof_dev. This will be used to determine the actions to be performed during system suspend. One of the main motivations for adding this change is the fact that errors during the top-level SOF device probe cannot be propagated and therefore suspending the SOF device normally during system suspend could potentially run into errors. For example, with the current flow, if the FW boot failed for some reason and the system suspends, the SOF device suspend could fail because the CTX_SAVE IPC would be attempted even though the FW never really booted successfully causing it to time out. Another scenario that the state machine fixes is when the runtime suspend for the SOF device fails and the DSP is powered down nevertheless, the CTX_SAVE IPC during system suspend would timeout because the DSP is already powered down. Reviewed-by: Curtis Malainey Reviewed-by: Daniel Baluta Signed-off-by: Ranjani Sridharan Signed-off-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20191218002616.7652-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 0b84591fdd5ea3ca0d93aaea489353f0381832c0 Author: Quinn Tran Date: Tue Dec 17 14:06:11 2019 -0800 scsi: qla2xxx: Fix stuck login session using prli_pend_timer [ Upstream commit 8aaac2d7da873aebeba92c666f82c00bbd74aaf9 ] Session is stuck if driver sees FW has received a PRLI. Driver allows FW to finish with processing of PRLI by checking back with FW at a later time to see if the PRLI has finished. Instead, driver failed to push forward after re-checking PRLI completion. Fixes: ce0ba496dccf ("scsi: qla2xxx: Fix stuck login session") Cc: stable@vger.kernel.org # 5.3 Link: https://lore.kernel.org/r/20191217220617.28084-9-hmadhani@marvell.com Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 78cbd2c397bfca30c1b967fd3c1c65933a3d757e Author: Mike Snitzer Date: Mon Jan 27 14:07:23 2020 -0500 dm: fix potential for q->make_request_fn NULL pointer commit 47ace7e012b9f7ad71d43ac9063d335ea3d6820b upstream. Move blk_queue_make_request() to dm.c:alloc_dev() so that q->make_request_fn is never NULL during the lifetime of a DM device (even one that is created without a DM table). Otherwise generic_make_request() will crash simply by doing: dmsetup create -n test mount /dev/dm-N /mnt While at it, move ->congested_data initialization out of dm.c:alloc_dev() and into the bio-based specific init method. Reported-by: Stefan Bader BugLink: https://bugs.launchpad.net/bugs/1860231 Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper") Depends-on: c12c9a3c3860c ("dm: various cleanups to md->queue initialization code") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 1426201af047f233d98958e1862a13f66e9ce09f Author: Mike Snitzer Date: Mon Jan 13 11:18:51 2020 -0500 dm thin metadata: use pool locking at end of dm_pool_metadata_close commit 44d8ebf436399a40fcd10dd31b29d37823d62fcc upstream. Ensure that the pool is locked during calls to __commit_transaction and __destroy_persistent_data_objects. Just being consistent with locking, but reality is dm_pool_metadata_close is called once pool is being destroyed so access to pool shouldn't be contended. Also, use pmd_write_lock_in_core rather than __pmd_write_lock in dm_pool_commit_metadata and rename __pmd_write_lock to pmd_write_lock_in_core -- there was no need for the alias. In addition, verify that the pool is locked in __commit_transaction(). Fixes: 873f258becca ("dm thin metadata: do not write metadata if no changes occurred") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 40d3d8d6eb64b8b2e7eed1487adeaf3092a248e1 Author: Milan Broz Date: Mon Jan 6 10:11:47 2020 +0100 dm crypt: fix benbi IV constructor crash if used in authenticated mode commit 4ea9471fbd1addb25a4d269991dc724e200ca5b5 upstream. If benbi IV is used in AEAD construction, for example: cryptsetup luksFormat --cipher twofish-xts-benbi --key-size 512 --integrity=hmac-sha256 the constructor uses wrong skcipher function and crashes: BUG: kernel NULL pointer dereference, address: 00000014 ... EIP: crypt_iv_benbi_ctr+0x15/0x70 [dm_crypt] Call Trace: ? crypt_subkey_size+0x20/0x20 [dm_crypt] crypt_ctr+0x567/0xfc0 [dm_crypt] dm_table_add_target+0x15f/0x340 [dm_mod] Fix this by properly using crypt_aead_blocksize() in this case. Fixes: ef43aa38063a6 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Cc: stable@vger.kernel.org # v4.12+ Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=941051 Reported-by: Jerad Simpson Signed-off-by: Milan Broz Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit b805ec7d089f20712a1fbb4a96e41c7142b22048 Author: Mikulas Patocka Date: Thu Jan 2 08:23:32 2020 -0500 dm crypt: fix GFP flags passed to skcipher_request_alloc() commit 9402e959014a18b4ebf7558733076875808dd66c upstream. GFP_KERNEL is not supposed to be or'd with GFP_NOFS (the result is equivalent to GFP_KERNEL). Also, we use GFP_NOIO instead of GFP_NOFS because we don't want any I/O being submitted in the direct reclaim path. Fixes: 39d13a1ac41d ("dm crypt: reuse eboiv skcipher for IV generation") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 1781fa54a4eaf7cf2cd5924c5833f7b06efffae7 Author: Mikulas Patocka Date: Wed Jan 8 10:46:05 2020 -0500 dm writecache: fix incorrect flush sequence when doing SSD mode commit commit aa9509209c5ac2f0b35d01a922bf9ae072d0c2fc upstream. When committing state, the function writecache_flush does the following: 1. write metadata (writecache_commit_flushed) 2. flush disk cache (writecache_commit_flushed) 3. wait for data writes to complete (writecache_wait_for_ios) 4. increase superblock seq_count 5. write the superblock 6. flush disk cache It may happen that at step 3, when we wait for some write to finish, the disk may report the write as finished, but the write only hit the disk cache and it is not yet stored in persistent storage. At step 5 we write the superblock - it may happen that the superblock is written before the write that we waited for in step 3. If the machine crashes, it may result in incorrect data being returned after reboot. In order to fix the bug, we must swap steps 2 and 3 in the above sequence, so that we first wait for writes to complete and then flush the disk cache. Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit a8d99d630117c2b933ef6058b722c92d2b0dbb24 Author: Joe Thornber Date: Tue Jan 7 11:58:42 2020 +0000 dm space map common: fix to ensure new block isn't already in use commit 4feaef830de7ffdd8352e1fe14ad3bf13c9688f8 upstream. The space-maps track the reference counts for disk blocks allocated by both the thin-provisioning and cache targets. There are variants for tracking metadata blocks and data blocks. Transactionality is implemented by never touching blocks from the previous transaction, so we can rollback in the event of a crash. When allocating a new block we need to ensure the block is free (has reference count of 0) in both the current and previous transaction. Prior to this fix we were doing this by searching for a free block in the previous transaction, and relying on a 'begin' counter to track where the last allocation in the current transaction was. This 'begin' field was not being updated in all code paths (eg, increment of a data block reference count due to breaking sharing of a neighbour block in the same btree leaf). This fix keeps the 'begin' field, but now it's just a hint to speed up the search. Instead the current transaction is searched for a free block, and then the old transaction is double checked to ensure it's free. Much simpler. This fixes reports of sm_disk_new_block()'s BUG_ON() triggering when DM thin-provisioning's snapshots are heavily used. Reported-by: Eric Wheeler Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit 188f9b710ff1601b1c694975494cfc7782b4ad54 Author: Dmitry Fomichev Date: Mon Dec 23 17:05:46 2019 -0800 dm zoned: support zone sizes smaller than 128MiB commit b39962950339912978484cdac50069258545d753 upstream. dm-zoned is observed to log failed kernel assertions and not work correctly when operating against a device with a zone size smaller than 128MiB (e.g. 32768 bits per 4K block). The reason is that the bitmap size per zone is calculated as zero with such a small zone size. Fix this problem and also make the code related to zone bitmap management be able to handle per zone bitmaps smaller than a single block. A dm-zoned-tools patch is required to properly format dm-zoned devices with zone sizes smaller than 128MiB. Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Fomichev Reviewed-by: Damien Le Moal Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman commit ad7c38abe8ba4954e89c443bfb3ca260045635e5 Author: Chen-Yu Tsai Date: Mon Feb 3 17:37:48 2020 -0800 ARM: dma-api: fix max_pfn off-by-one error in __dma_supported() commit f3cc4e1d44a813a0685f2e558b78ace3db559722 upstream. max_pfn, as set in arch/arm/mm/init.c: static void __init find_limits(unsigned long *min, unsigned long *max_low, unsigned long *max_high) { *max_low = PFN_DOWN(memblock_get_current_limit()); *min = PFN_UP(memblock_start_of_DRAM()); *max_high = PFN_DOWN(memblock_end_of_DRAM()); } with memblock_end_of_DRAM() pointing to the next byte after DRAM. As such, max_pfn points to the PFN after the end of DRAM. Thus when using max_pfn to check DMA masks, we should subtract one when checking DMA ranges against it. Commit 8bf1268f48ad ("ARM: dma-api: fix off-by-one error in __dma_supported()") fixed the same issue, but missed this spot. This issue was found while working on the sun4i-csi v4l2 driver on the Allwinner R40 SoC. On Allwinner SoCs, DRAM is offset at 0x40000000, and we are starting to use of_dma_configure() with the "dma-ranges" property in the device tree to have the DMA API handle the offset. In this particular instance, dma-ranges was set to the same range as the actual available (2 GiB) DRAM. The following error appeared when the driver attempted to allocate a buffer: sun4i-csi 1c09000.csi: Coherent DMA mask 0x7fffffff (pfn 0x40000-0xc0000) covers a smaller range of system memory than the DMA zone pfn 0x0-0xc0001 sun4i-csi 1c09000.csi: dma_alloc_coherent of size 307200 failed Fixing the off-by-one error makes things work. Link: http://lkml.kernel.org/r/20191224030239.5656-1-wens@kernel.org Fixes: 11a5aa32562e ("ARM: dma-mapping: check DMA mask against available memory") Fixes: 9f28cde0bc64 ("ARM: another fix for the DMA mapping checks") Fixes: ab746573c405 ("ARM: dma-mapping: allow larger DMA mask than supported") Signed-off-by: Chen-Yu Tsai Reviewed-by: Christoph Hellwig Cc: Russell King Cc: Robin Murphy Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit bae74e7ac8421650c6f49b4523d9497c7afe0360 Author: Michael Ellerman Date: Sun Jan 26 22:52:47 2020 +1100 of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc commit dabf6b36b83a18d57e3d4b9d50544ed040d86255 upstream. There's an OF helper called of_dma_is_coherent(), which checks if a device has a "dma-coherent" property to see if the device is coherent for DMA. But on some platforms devices are coherent by default, and on some platforms it's not possible to update existing device trees to add the "dma-coherent" property. So add a Kconfig symbol to allow arch code to tell of_dma_is_coherent() that devices are coherent by default, regardless of the presence of the property. Select that symbol on powerpc when NOT_COHERENT_CACHE is not set, ie. when the system has a coherent cache. Fixes: 92ea637edea3 ("of: introduce of_dma_is_coherent() helper") Cc: stable@vger.kernel.org # v3.16+ Reported-by: Christian Zigotzky Tested-by: Christian Zigotzky Signed-off-by: Michael Ellerman Reviewed-by: Ulf Hansson Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit f5f68d165dc0c8603d386ac272fbea0a1609c0de Author: Rafael J. Wysocki Date: Sun Jan 26 23:40:11 2020 +0100 cpufreq: Avoid creating excessively large stack frames commit 1e4f63aecb53e48468661e922fc2fa3b83e55722 upstream. In the process of modifying a cpufreq policy, the cpufreq core makes a copy of it including all of the internals which is stored on the CPU stack. Because struct cpufreq_policy is relatively large, this may cause the size of the stack frame to exceed the 2 KB limit and so the GCC complains when -Wframe-larger-than= is used. In fact, it is not necessary to copy the entire policy structure in order to modify it, however. First, because cpufreq_set_policy() obtains the min and max policy limits from frequency QoS now, it is not necessary to pass the limits to it from the callers. The only things that need to be passed to it from there are the new governor pointer or (if there is a built-in governor in the driver) the "policy" value representing the governor choice. They both can be passed as individual arguments, though, so make cpufreq_set_policy() take them this way and rework its callers accordingly. This avoids making copies of cpufreq policies in the callers of cpufreq_set_policy(). Second, cpufreq_set_policy() still needs to pass the new policy data to the ->verify() callback of the cpufreq driver whose task is to sanitize the min and max policy limits. It still does not need to make a full copy of struct cpufreq_policy for this purpose, but it needs to pass a few items from it to the driver in case they are needed (different drivers have different needs in that respect and all of them have to be covered). For this reason, introduce struct cpufreq_policy_data to hold copies of the members of struct cpufreq_policy used by the existing ->verify() driver callbacks and pass a pointer to a temporary structure of that type to ->verify() (instead of passing a pointer to full struct cpufreq_policy to it). While at it, notice that intel_pstate and longrun don't really need to verify the "policy" value in struct cpufreq_policy, so drop those check from them to avoid copying "policy" into struct cpufreq_policy_data (which allows it to be slightly smaller). Also while at it fix up white space in a couple of places and make cpufreq_set_policy() static (as it can be so). Fixes: 3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS") Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.com Reported-by: kbuild test robot Reported-by: Geert Uytterhoeven Cc: 5.4+ # 5.4+ Signed-off-by: Rafael J. Wysocki Acked-by: Viresh Kumar Signed-off-by: Greg Kroah-Hartman commit 7dce99d3182a1495bd14cac8403ee471ecdb7ea4 Author: Rafael J. Wysocki Date: Thu Jan 23 00:11:24 2020 +0100 PM: core: Fix handling of devices deleted during system-wide resume commit 0552e05fdfea191a2cf3a0abd33574b5ef9ca818 upstream. If a device is deleted by one of its system-wide resume callbacks (for example, because it does not appear to be present or accessible any more) along with its children, the resume of the children may continue leading to use-after-free errors and other issues (potentially). Namely, if the device's children are resumed asynchronously, their resume may have been scheduled already before the device's callback runs and so the device may be deleted while dpm_wait_for_superior() is being executed for them. The memory taken up by the parent device object may be freed then while dpm_wait() is waiting for the parent's resume callback to complete, which leads to a use-after-free. Moreover, the resume of the children is really not expected to continue after they have been unregistered, so it must be terminated right away in that case. To address this problem, modify dpm_wait_for_superior() to check if the target device is still there in the system-wide PM list of devices and if so, to increment its parent's reference counter, both under dpm_list_mtx which prevents device_del() running for the child from dropping the parent's reference counter prematurely. If the device is not present in the system-wide PM list of devices any more, the resume of it cannot continue, so check that again after dpm_wait() returns, which means that the parent's callback has been completed, and pass the result of that check to the caller of dpm_wait_for_superior() to allow it to abort the device's resume if it is not there any more. Link: https://lore.kernel.org/linux-pm/1579568452-27253-1-git-send-email-chanho.min@lge.com Reported-by: Chanho Min Cc: All applicable Signed-off-by: Rafael J. Wysocki Acked-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit e9116299ffac758111547667f9bc8a0f284c5d93 Author: Eric Biggers Date: Thu Jan 23 20:15:49 2020 -0800 f2fs: fix race conditions in ->d_compare() and ->d_hash() commit 80f2388afa6ef985f9c5c228e36705c4d4db4756 upstream. Since ->d_compare() and ->d_hash() can be called in RCU-walk mode, ->d_parent and ->d_inode can be concurrently modified, and in particular, ->d_inode may be changed to NULL. For f2fs_d_hash() this resulted in a reproducible NULL dereference if a lookup is done in a directory being deleted, e.g. with: int main() { if (fork()) { for (;;) { mkdir("subdir", 0700); rmdir("subdir"); } } else { for (;;) access("subdir/file", 0); } } ... or by running the 't_encrypted_d_revalidate' program from xfstests. Both repros work in any directory on a filesystem with the encoding feature, even if the directory doesn't actually have the casefold flag. I couldn't reproduce a crash in f2fs_d_compare(), but it appears that a similar crash is possible there. Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and falling back to the case sensitive behavior if the inode is NULL. Reported-by: Al Viro Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Cc: # v5.4+ Signed-off-by: Eric Biggers Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit 6d722cd2e387921316ce52d1de9d4dbd88972ad8 Author: Eric Biggers Date: Thu Jan 23 20:15:48 2020 -0800 f2fs: fix dcache lookup of !casefolded directories commit 5515eae647426169e4b7969271fb207881eba7f6 upstream. Do the name comparison for non-casefolded directories correctly. This is analogous to ext4's commit 66883da1eee8 ("ext4: fix dcache lookup of !casefolded directories"). Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Cc: # v5.4+ Signed-off-by: Eric Biggers Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit f4803553aae6a8c315abe4be2d83a58f2f7af642 Author: Chengguang Xu Date: Sat Jan 4 22:20:04 2020 +0800 f2fs: code cleanup for f2fs_statfs_project() commit bf2cbd3c57159c2b639ee8797b52ab5af180bf83 upstream. Calling min_not_zero() to simplify complicated prjquota limit comparison in f2fs_statfs_project(). Signed-off-by: Chengguang Xu Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim Signed-off-by: Greg Kroah-Hartman commit b1de9ec0e78866b252a32fee126af22ec8d42603 Author: Chengguang Xu Date: Sat Jan 4 22:20:03 2020 +0800 f2fs: fix miscounted block limit in f2fs_statfs_project() commit acdf2172172a511f97fa21ed0ee7609a6d3b3a07 upstream. statfs calculates Total/Used/Avail disk space in block unit, so we should translate soft/hard prjquota limit to block unit as well. Below testing result shows the block/inode numbers of Total/Used/Avail from df command are all correct afer applying this patch. [root@localhost quota-tools]\# ./repquota -P /dev/sdb1 commit ae2cb41583a94cb9ca2cf209e03ad6d790a29c3c Author: Chengguang Xu Date: Mon Nov 25 11:20:36 2019 +0800 f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() commit 909110c060f22e65756659ec6fa957ae75777e00 upstream. Setting softlimit larger than hardlimit seems meaningless for disk quota but currently it is allowed. In this case, there may be a bit of comfusion for users when they run df comamnd to directory which has project quota. For example, we set 20M softlimit and 10M hardlimit of block usage limit for project quota of test_dir(project id 123). [root@hades f2fs]# repquota -P -a commit 08846286bf2854c0d4326315af260735759d6933 Author: Miklos Szeredi Date: Mon Feb 3 11:41:53 2020 +0100 ovl: fix lseek overflow on 32bit commit a4ac9d45c0cd14a2adc872186431c79804b77dbf upstream. ovl_lseek() is using ssize_t to return the value from vfs_llseek(). On a 32-bit kernel ssize_t is a 32-bit signed int, which overflows above 2 GB. Assign the return value of vfs_llseek() to loff_t to fix this. Reported-by: Boris Gjenero Fixes: 9e46b840c705 ("ovl: support stacked SEEK_HOLE/SEEK_DATA") Cc: # v4.19 Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 809e16a6eeb39b263f13bee53a6c8c1befc4aad0 Author: Amir Goldstein Date: Sun Dec 22 22:47:54 2019 +0200 ovl: fix wrong WARN_ON() in ovl_cache_update_ino() commit 4c37e71b713ecffe81f8e6273c6835e54306d412 upstream. The WARN_ON() that child entry is always on overlay st_dev became wrong when we allowed this function to update d_ino in non-samefs setup with xino enabled. It is not true in case of xino bits overflow on a non-dir inode. Leave the WARN_ON() only for directories, where assertion is still true. Fixes: adbf4f7ea834 ("ovl: consistent d_ino for non-samefs with xino") Cc: # v4.17+ Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 03572189d61c3bc09ba69a379af8104e37c7f765 Author: Sven Van Asbroeck Date: Thu Sep 19 11:11:37 2019 -0400 power: supply: ltc2941-battery-gauge: fix use-after-free commit a60ec78d306c6548d4adbc7918b587a723c555cc upstream. This driver's remove path calls cancel_delayed_work(). However, that function does not wait until the work function finishes. This could mean that the work function is still running after the driver's remove function has finished, which would result in a use-after-free. Fix by calling cancel_delayed_work_sync(), which ensures that that the work is properly cancelled, no longer running, and unable to re-schedule itself. This issue was detected with the help of Coccinelle. Cc: stable Signed-off-by: Sven Van Asbroeck Signed-off-by: Sebastian Reichel Signed-off-by: Greg Kroah-Hartman commit 6f33d59ae165d50f9007a24314c903fd320ce5c4 Author: Samuel Holland Date: Sun Jan 12 21:53:03 2020 -0600 power: supply: axp20x_ac_power: Fix reporting online status commit 1c51aad8475d670ad58ae60adc9d32342381df8d upstream. AXP803/AXP813 have a flag that enables/disables the AC power supply input. This flag does not affect the status bits in PWR_INPUT_STATUS. Its effect can be verified by checking the battery charge/discharge state (bit 2 of PWR_INPUT_STATUS), or by examining the current draw on the AC input. Take this flag into account when getting the ONLINE property of the AC input, on PMICs where this flag is present. Fixes: 7693b5643fd2 ("power: supply: add AC power supply driver for AXP813") Cc: stable@vger.kernel.org Signed-off-by: Samuel Holland Reviewed-by: Chen-Yu Tsai Signed-off-by: Sebastian Reichel Signed-off-by: Greg Kroah-Hartman commit 4eed5d3bb14a1e51f645413c3c6a8e328b007e3c Author: Thomas Renninger Date: Fri Jan 17 08:55:54 2020 +0100 cpupower: Revert library ABI changes from commit ae2917093fb60bdc1ed3e commit 41ddb7e1f79693d904502ae9bea609837973eff8 upstream. Commit ae2917093fb6 ("tools/power/cpupower: Display boost frequency separately") modified the library function: struct cpufreq_available_frequencies *cpufreq_get_available_frequencies(unsigned int cpu) to struct cpufreq_frequencies *cpufreq_get_frequencies(const char *type, unsigned int cpu) This patch recovers the old API and implements the new functionality in a newly introduce method: struct cpufreq_boost_frequencies *cpufreq_get_available_frequencies(unsigned int cpu) This one should get merged into stable kernels back to 5.0 when the above had been introduced. Fixes: ae2917093fb6 ("tools/power/cpupower: Display boost frequency separately") Cc: stable@vger.kernel.org Signed-off-by: Thomas Renninger Signed-off-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman commit 2b27acfde9b28713cc8269ee5bf7a045798f5e95 Author: Quinn Tran Date: Tue Dec 17 14:06:16 2019 -0800 scsi: qla2xxx: Fix mtcp dump collection failure commit 641e0efddcbde52461e017136acd3ce7f2ef0c14 upstream. MTCP dump failed due to MB Reg 10 was picking garbage data from stack memory. Fixes: 81178772b636a ("[SCSI] qla2xxx: Implemetation of mctp.") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191217220617.28084-14-hmadhani@marvell.com Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit edd15b154653950f73b96563a6f9b5b0439c4714 Author: Anand Lodnoor Date: Tue Jan 14 16:51:19 2020 +0530 scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state commit 6d7537270e3283b92f9b327da9d58a4de40fe8d0 upstream. Driver initiates OCR if a DCMD command times out. But there is a deadlock if the driver attempts to invoke another OCR before the mutex lock (reset_mutex) is released from the previous session of OCR. This patch takes care of the above scenario using new flag MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.com Signed-off-by: Shivasharan S Signed-off-by: Anand Lodnoor Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 3728834fff19f8adca9bbb1324f13e2be0704fea Author: Gao Xiang Date: Tue Jan 7 10:25:46 2020 +0800 erofs: fix out-of-bound read for shifted uncompressed block commit 4d2024370d877f9ac8b98694bcff666da6a5d333 upstream. rq->out[1] should be valid before accessing. Otherwise, in very rare cases, out-of-bound dirty onstack rq->out[1] can equal to *in and lead to unintended memmove behavior. Link: https://lore.kernel.org/r/20200107022546.19432-1-gaoxiang25@huawei.com Fixes: 7fc45dbc938a ("staging: erofs: introduce generic decompression backend") Cc: # 5.3+ Reviewed-by: Chao Yu Signed-off-by: Gao Xiang Signed-off-by: Greg Kroah-Hartman commit 3ebbfdf41d268590350ff1ad11ec44f6d7e3b70e Author: Geert Uytterhoeven Date: Mon Jan 27 10:31:07 2020 +0100 scripts/find-unused-docs: Fix massive false positives commit 1630146db2111412e7524d05d812ff8f2c75977e upstream. scripts/find-unused-docs.sh invokes scripts/kernel-doc to find out if a source file contains kerneldoc or not. However, as it passes the no longer supported "-text" option to scripts/kernel-doc, the latter prints out its help text, causing all files to be considered containing kerneldoc. Get rid of these false positives by removing the no longer supported "-text" option from the scripts/kernel-doc invocation. Cc: stable@vger.kernel.org # 4.16+ Fixes: b05142675310d2ac ("scripts: kernel-doc: get rid of unused output formats") Signed-off-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20200127093107.26401-1-geert+renesas@glider.be Signed-off-by: Jonathan Corbet Signed-off-by: Greg Kroah-Hartman commit a421f513779cb0b84b818487959ef2eb8f812a2c Author: Filipe Manana Date: Mon Dec 16 18:26:55 2019 +0000 fs: allow deduplication of eof block into the end of the destination file commit a5e6ea18e3d132be4716eb5fdd520c2c234e3003 upstream. We always round down, to a multiple of the filesystem's block size, the length to deduplicate at generic_remap_check_len(). However this is only needed if an attempt to deduplicate the last block into the middle of the destination file is requested, since that leads into a corruption if the length of the source file is not block size aligned. When an attempt to deduplicate the last block into the end of the destination file is requested, we should allow it because it is safe to do it - there's no stale data exposure and we are prepared to compare the data ranges for a length not aligned to the block (or page) size - in fact we even do the data compare before adjusting the deduplication length. After btrfs was updated to use the generic helpers from VFS (by commit 34a28e3d77535e ("Btrfs: use generic_remap_file_range_prep() for cloning and deduplication")) we started to have user reports of deduplication not reflinking the last block anymore, and whence users getting lower deduplication scores. The main use case is deduplication of entire files that have a size not aligned to the block size of the filesystem. We already allow cloning the last block to the end (and beyond) of the destination file, so allow for deduplication as well. Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/ CC: stable@vger.kernel.org # 5.1+ Reviewed-by: Josef Bacik Reviewed-by: Darrick J. Wong Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 5fefc9b3e3584a1ce98da27c38e1b8dda1939d74 Author: Herbert Xu Date: Tue Nov 19 13:17:31 2019 +0800 padata: Remove broken queue flushing commit 07928d9bfc81640bab36f5190e8725894d93b659 upstream. The function padata_flush_queues is fundamentally broken because it cannot force padata users to complete the request that is underway. IOW padata has to passively wait for the completion of any outstanding work. As it stands flushing is used in two places. Its use in padata_stop is simply unnecessary because nothing depends on the queues to be flushed afterwards. The other use in padata_replace is more substantial as we depend on it to free the old pd structure. This patch instead uses the pd->refcnt to dynamically free the pd structure once all requests are complete. Fixes: 2b73b07ab8a4 ("padata: Flush the padata queues actively") Cc: Signed-off-by: Herbert Xu Reviewed-by: Daniel Jordan Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 5f63963669ec54ec5c36227e55f825f911bfaa90 Author: Gilad Ben-Yossef Date: Thu Jan 16 12:14:43 2020 +0200 crypto: ccree - fix PM race condition commit 15fd2566bf54ee4d4781d8f170acfc9472a1541f upstream. The PM code was racy, possibly causing the driver to submit requests to a powered down device. Fix the race and while at it simplify the PM code. Signed-off-by: Gilad Ben-Yossef Fixes: 1358c13a48c4 ("crypto: ccree - fix resume race condition on init") Cc: stable@kernel.org # v4.20 Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 5e33535cf13c0a85f2bda324fced2502f9ca82e9 Author: Ofir Drang Date: Thu Jan 16 12:14:42 2020 +0200 crypto: ccree - fix FDE descriptor sequence commit 5c83e8ec4d51ac4cc58482ed04297e6882b32a09 upstream. In FDE mode (xts, essiv and bitlocker) the cryptocell hardware requires that the the XEX key will be loaded after Key1. Signed-off-by: Ofir Drang Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit d8760030e7017558faa748ba62376be5ab6babf1 Author: Gilad Ben-Yossef Date: Thu Jan 16 12:14:40 2020 +0200 crypto: ccree - fix pm wrongful error reporting commit cedca59fae5834af8445b403c66c9953754375d7 upstream. pm_runtime_get_sync() can return 1 as a valid (none error) return code. Treat it as such. Signed-off-by: Gilad Ben-Yossef Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 70439e8b7ccfe70b810d98bb441c37e5c43ef8c1 Author: Gilad Ben-Yossef Date: Thu Jan 16 12:14:38 2020 +0200 crypto: ccree - fix AEAD decrypt auth fail commit 2a6bc713f1cef32e39e3c4e6f2e1a9849da6379c upstream. On AEAD decryption authentication failure we are suppose to zero out the output plaintext buffer. However, we've missed skipping the optional associated data that may prefix the ciphertext. This commit fixes this issue. Signed-off-by: Gilad Ben-Yossef Fixes: e88b27c8eaa8 ("crypto: ccree - use std api sg_zero_buffer") Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 80c660892c24c867ee5b5bb367229ccfb2f2829b Author: Gilad Ben-Yossef Date: Wed Nov 27 10:49:08 2019 +0200 crypto: ccree - fix backlog memory leak commit 4df2ef25b3b3618fd708ab484fe6239abd130fec upstream. Fix brown paper bag bug of not releasing backlog list item buffer when backlog was consumed causing a memory leak when backlog is used. Signed-off-by: Gilad Ben-Yossef Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit d2b1dcd5430f2fb32fd1278dd96a55692037d358 Author: Herbert Xu Date: Fri Dec 6 13:55:17 2019 +0800 crypto: api - Check spawn->alg under lock in crypto_drop_spawn commit 7db3b61b6bba4310f454588c2ca6faf2958ad79f upstream. We need to check whether spawn->alg is NULL under lock as otherwise the algorithm could be removed from under us after we have checked it and found it to be non-NULL. This could cause us to remove the spawn from a non-existent list. Fixes: 7ede5a5ba55a ("crypto: api - Fix crypto_drop_spawn crash...") Cc: Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 0ed43162f578352b48ffc2b2260b2a2530450be5 Author: Bitan Biswas Date: Thu Jan 9 10:40:17 2020 +0000 nvmem: core: fix memory abort in cleanup path commit 16bb7abc4a6b9defffa294e4dc28383e62a1dbcf upstream. nvmem_cell_info_to_nvmem_cell implementation has static allocation of name. nvmem_add_cells_from_of() call may return error and kfree name results in memory abort. Use kstrdup_const() and kfree_const calls for name alloc and free. Unable to handle kernel paging request at virtual address ffffffffffe44888 Mem abort info: ESR = 0x96000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 swapper pgtable: 64k pages, 48-bit VAs, pgdp=00000000815d0000 [ffffffffffe44888] pgd=0000000081d30803, pud=0000000081d30803, pmd=0000000000000000 Internal error: Oops: 96000006 [#1] PREEMPT SMP Modules linked in: CPU: 2 PID: 43 Comm: kworker/2:1 Tainted Hardware name: quill (DT) Workqueue: events deferred_probe_work_func pstate: a0000005 (NzCv daif -PAN -UAO) pc : kfree+0x38/0x278 lr : nvmem_cell_drop+0x68/0x80 sp : ffff80001284f9d0 x29: ffff80001284f9d0 x28: ffff0001f677e830 x27: ffff800011b0b000 x26: ffff0001c36e1008 x25: ffff8000112ad000 x24: ffff8000112c9000 x23: ffffffffffffffea x22: ffff800010adc7f0 x21: ffffffffffe44880 x20: ffff800011b0b068 x19: ffff80001122d380 x18: ffffffffffffffff x17: 00000000d5cb4756 x16: 0000000070b193b8 x15: ffff8000119538c8 x14: 0720072007200720 x13: 07200720076e0772 x12: 07750762072d0765 x11: 0773077507660765 x10: 072f073007300730 x9 : 0730073207380733 x8 : 0000000000000151 x7 : 07660765072f0720 x6 : ffff0001c00e0f00 x5 : 0000000000000000 x4 : ffff0001c0b43800 x3 : ffff800011b0b068 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffffdfffe00000 Call trace: kfree+0x38/0x278 nvmem_cell_drop+0x68/0x80 nvmem_device_remove_all_cells+0x2c/0x50 nvmem_register.part.9+0x520/0x628 devm_nvmem_register+0x48/0xa0 tegra_fuse_probe+0x140/0x1f0 platform_drv_probe+0x50/0xa0 really_probe+0x108/0x348 driver_probe_device+0x58/0x100 __device_attach_driver+0x90/0xb0 bus_for_each_drv+0x64/0xc8 __device_attach+0xd8/0x138 device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0x74/0xb0 process_one_work+0x1e0/0x358 worker_thread+0x208/0x488 kthread+0x118/0x120 ret_from_fork+0x10/0x18 Code: d350feb5 f2dffbe0 aa1e03f6 8b151815 (f94006a0) ---[ end trace 49b1303c6b83198e ]--- Fixes: badcdff107cbf ("nvmem: Convert to using %pOFn instead of device_node.name") Signed-off-by: Bitan Biswas Cc: stable Signed-off-by: Srinivas Kandagatla Link: https://lore.kernel.org/r/20200109104017.6249-5-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman commit 6bdd1a0ed11141e4a3574b28200f45e65343f0e6 Author: Samuel Holland Date: Sat Jan 4 19:24:08 2020 -0600 mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile commit dc91c3b6fe66a13ac76f6cb3b2100c0779cd3350 upstream. On AXP288 and newer PMICs, bit 7 of AXP20X_VBUS_IPSOUT_MGMT can be set to prevent using the VBUS input. However, when the VBUS unplugged and plugged back in, the bit automatically resets to zero. We need to set the register as volatile to prevent regmap from caching that bit. Otherwise, regcache will think the bit is already set and not write the register. Fixes: cd53216625a0 ("mfd: axp20x: Fix axp288 volatile ranges") Cc: stable@vger.kernel.org Signed-off-by: Samuel Holland Reviewed-by: Chen-Yu Tsai Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 385c61a41cf9abc504f8ecbe8c41ced850c1e9c1 Author: Tianyu Lan Date: Sat Jan 25 16:50:47 2020 -0500 hv_balloon: Balloon up according to request page number commit d33c240d47dab4fd15123d9e73fc8810cbc6ed6a upstream. Current code has assumption that balloon request memory size aligns with 2MB. But actually Hyper-V doesn't guarantee such alignment. When balloon driver receives non-aligned balloon request, it produces warning and balloon up more memory than requested in order to keep 2MB alignment. Remove the warning and balloon up memory according to actual requested memory size. Fixes: f6712238471a ("hv: hv_balloon: avoid memory leak on alloc_error of 2MB memory block") Cc: stable@vger.kernel.org Reviewed-by: Vitaly Kuznetsov Signed-off-by: Tianyu Lan Reviewed-by: Michael Kelley Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 570a29b1f75bfa9c420a1a69d2a4d03dad278928 Author: Pierre-Louis Bossart Date: Fri Jan 24 15:36:20 2020 -0600 ASoC: SOF: core: free trace on errors commit 37e97e6faeabda405d0c4319f8419dcc3da14b2b upstream. free_trace() is not called on probe errors, fix Reviewed-by: Kai Vehmanen Signed-off-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20200124213625.30186-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 84c9efd2f855e538854833510d0fe8bfa3362a65 Author: Michał Mirosław Date: Thu Jan 2 11:42:16 2020 +0100 mmc: sdhci-of-at91: fix memleak on clk_get failure [ Upstream commit a04184ce777b46e92c2b3c93c6dcb2754cb005e1 ] sdhci_alloc_host() does its work not using managed infrastructure, so needs explicit free on error path. Add it where needed. Cc: Fixes: bb5f8ea4d514 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC") Signed-off-by: Michał Mirosław Acked-by: Ludovic Desroches Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/b2a44d5be2e06ff075f32477e466598bb0f07b36.1577961679.git.mirq-linux@rere.qmqm.pl Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 65e6f63ebfb93a55a1e215f2d632a3b63d19c30c Author: Zhihao Cheng Date: Sat Jan 11 17:50:36 2020 +0800 ubifs: Fix deadlock in concurrent bulk-read and writepage commit f5de5b83303e61b1f3fb09bd77ce3ac2d7a475f2 upstream. In ubifs, concurrent execution of writepage and bulk read on the same file may cause ABBA deadlock, for example (Reproduce method see Link): Process A(Bulk-read starts from page4) Process B(write page4 back) vfs_read wb_workfn or fsync ... ... generic_file_buffered_read write_cache_pages ubifs_readpage LOCK(page4) ubifs_bulk_read ubifs_writepage LOCK(ui->ui_mutex) ubifs_write_inode ubifs_do_bulk_read LOCK(ui->ui_mutex) find_or_create_page(alloc page4) ↑ LOCK(page4) <-- ABBA deadlock occurs! In order to ensure the serialization execution of bulk read, we can't remove the big lock 'ui->ui_mutex' in ubifs_bulk_read(). Instead, we allow ubifs_do_bulk_read() to lock page failed by replacing find_or_create_page(FGP_LOCK) with pagecache_get_page(FGP_LOCK | FGP_NOWAIT). Signed-off-by: Zhihao Cheng Suggested-by: zhangyi (F) Cc: Fixes: 4793e7c5e1c ("UBIFS: add bulk-read facility") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206153 Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit e3a561aa5376bdfc14590790fd892e532dbc06a1 Author: Eric Biggers Date: Mon Dec 9 14:23:24 2019 -0800 ubifs: Fix FS_IOC_SETFLAGS unexpectedly clearing encrypt flag commit 2b57067a7778484c10892fa191997bfda29fea13 upstream. UBIFS's implementation of FS_IOC_SETFLAGS fails to preserve existing inode flags that aren't settable by FS_IOC_SETFLAGS, namely the encrypt flag. This causes the encrypt flag to be unexpectedly cleared. Fix it by preserving existing unsettable flags, like ext4 and f2fs do. Test case with kvm-xfstests shell: FSTYP=ubifs KEYCTL_PROG=keyctl . fs/ubifs/config . ~/xfstests/common/encrypt dev=$(__blkdev_to_ubi_volume /dev/vdc) ubiupdatevol -t $dev mount $dev /mnt -t ubifs k=$(_generate_session_encryption_key) mkdir /mnt/edir xfs_io -c "set_encpolicy $k" /mnt/edir echo contents > /mnt/edir/file chattr +i /mnt/edir/file chattr -i /mnt/edir/file With the bug, the following errors occur on the last command: [ 18.081559] fscrypt (ubifs, inode 67): Inconsistent encryption context (parent directory: 65) chattr: Operation not permitted while reading flags on /mnt/edir/file Fixes: d475a507457b ("ubifs: Add skeleton for fscrypto") Cc: # v4.10+ Signed-off-by: Eric Biggers Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 91f96a9cdd82c3e86c49ad3fea90c1eb200267bc Author: Sascha Hauer Date: Wed Dec 4 11:09:58 2019 +0100 ubifs: Fix wrong memory allocation commit edec51374bce779f37fc209a228139c55d90ec8d upstream. In create_default_filesystem() when we allocate the idx node we must use the idx_node_size we calculated just one line before, not tmp, which contains completely other data. Fixes: c4de6d7e4319 ("ubifs: Refactor create_default_filesystem()") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Naga Sureshkumar Relli Tested-by: Naga Sureshkumar Relli Signed-off-by: Sascha Hauer Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 0119c617ebb6ec5399abddfd917f90c49ee9e747 Author: Eric Biggers Date: Mon Jan 20 14:31:59 2020 -0800 ubifs: don't trigger assertion on invalid no-key filename commit f0d07a98a070bb5e443df19c3aa55693cbca9341 upstream. If userspace provides an invalid fscrypt no-key filename which encodes a hash value with any of the UBIFS node type bits set (i.e. the high 3 bits), gracefully report ENOENT rather than triggering ubifs_assert(). Test case with kvm-xfstests shell: . fs/ubifs/config . ~/xfstests/common/encrypt dev=$(__blkdev_to_ubi_volume /dev/vdc) ubiupdatevol $dev -t mount $dev /mnt -t ubifs mkdir /mnt/edir xfs_io -c set_encpolicy /mnt/edir rm /mnt/edir/_,,,,,DAAAAAAAAAAAAAAAAAAAAAAAAAA With the bug, the following assertion fails on the 'rm' command: [ 19.066048] UBIFS error (ubi0:0 pid 379): ubifs_assert_failed: UBIFS assert failed: !(hash & ~UBIFS_S_KEY_HASH_MASK), in fs/ubifs/key.h:170 Fixes: f4f61d2cc6d8 ("ubifs: Implement encrypted filenames") Cc: # v4.10+ Link: https://lore.kernel.org/r/20200120223201.241390-5-ebiggers@kernel.org Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit 9220bf17ae180722be8673dae93df59c3d1c4e62 Author: Eric Biggers Date: Sun Jan 19 22:07:32 2020 -0800 fscrypt: don't print name of busy file when removing key commit 13a10da94615d81087e718517794f2868a8b3fab upstream. When an encryption key can't be fully removed due to file(s) protected by it still being in-use, we shouldn't really print the path to one of these files to the kernel log, since parts of this path are likely to be encrypted on-disk, and (depending on how the system is set up) the confidentiality of this path might be lost by printing it to the log. This is a trade-off: a single file path often doesn't matter at all, especially if it's a directory; the kernel log might still be protected in some way; and I had originally hoped that any "inode(s) still busy" bugs (which are security weaknesses in their own right) would be quickly fixed and that to do so it would be super helpful to always know the file path and not have to run 'find dir -inum $inum' after the fact. But in practice, these bugs can be hard to fix (e.g. due to asynchronous process killing that is difficult to eliminate, for performance reasons), and also not tied to specific files, so knowing a file path doesn't necessarily help. So to be safe, for now let's just show the inode number, not the path. If someone really wants to know a path they can use 'find -inum'. Fixes: b1c0ec3599f4 ("fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl") Cc: # v5.4+ Link: https://lore.kernel.org/r/20200120060732.390362-1-ebiggers@kernel.org Signed-off-by: Eric Biggers Signed-off-by: Greg Kroah-Hartman commit ad270734193e30670516dd3b2576afcd13a389ef Author: Stephen Boyd Date: Thu Jan 9 07:59:07 2020 -0800 alarmtimer: Unregister wakeup source when module get fails commit 6b6d188aae79a630957aefd88ff5c42af6553ee3 upstream. The alarmtimer_rtc_add_device() function creates a wakeup source and then tries to grab a module reference. If that fails the function returns early with an error code, but fails to remove the wakeup source. Cleanup this exit path so there is no dangling wakeup source, which is named 'alarmtime' left allocated which will conflict with another RTC device that may be registered later. Fixes: 51218298a25e ("alarmtimer: Ensure RTC module is not unloaded") Signed-off-by: Stephen Boyd Signed-off-by: Thomas Gleixner Reviewed-by: Douglas Anderson Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200109155910.907-2-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman commit 05b147599f9d9b791a3b911734af0d944e9a7177 Author: Hans de Goede Date: Tue Dec 10 10:57:52 2019 +0100 ACPI / battery: Deal better with neither design nor full capacity not being reported commit ff3154d1d89a2343fd5f82e65bc0cf1d4e6659b3 upstream. Commit b41901a2cf06 ("ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity") added support for some (broken) devices which always report 0 for both design_capacity and full_charge_capacity. Since the device that commit was written as a fix for is not reporting any form of "full" capacity we cannot calculate the value for the POWER_SUPPLY_PROP_CAPACITY, this is worked around by using an alternative array of available properties which does not contain this property. This is necessary because userspace (upower) treats us returning -ENODEV as 0 and then typically will trigger an emergency shutdown because of that. Userspace does not do this if the capacity sysfs attribute is not present at all. There are two potential problems with that commit: 1) It assumes that both full_charge- and design-capacity are broken at the same time and only checks if full_charge- is broken. 2) It assumes that this only ever happens for devices which report energy units rather then charge units. This commit fixes both issues by only using the alternative array of available properties if both full_charge- and design-capacity are broken and by also adding an alternative array of available properties for devices using mA units. Fixes: b41901a2cf06 ("ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity") Cc: 4.19+ # 4.19+ Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 7b86d05d1b0208d07c115ecb7da859005181a8cf Author: Hans de Goede Date: Tue Dec 10 10:57:51 2019 +0100 ACPI / battery: Use design-cap for capacity calculations if full-cap is not available commit 5b74d1d16e2f5753fcbdecd6771b2d8370dda414 upstream. The ThunderSoft TS178 tablet's _BIX implementation reports design_capacity but not full_charge_capacity. Before this commit this would cause us to return -ENODEV for the capacity attribute, which userspace does not like. Specifically upower does this: if (sysfs_file_exists (native_path, "capacity")) { percentage = sysfs_get_double (native_path, "capacity"); Where the sysfs_get_double() helper returns 0 when we return -ENODEV, so the battery always reads 0% if we return -ENODEV. This commit fixes this by using the design-capacity instead of the full-charge-capacity when the full-charge-capacity is not available. Fixes: b41901a2cf06 ("ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity") Cc: 4.19+ # 4.19+ Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 93bba324c28a7ed67a58d2d6eb114add348e3397 Author: Hans de Goede Date: Tue Dec 10 10:57:50 2019 +0100 ACPI / battery: Deal with design or full capacity being reported as -1 commit cc99f0ad52467028cb1251160f23ad4bb65baf20 upstream. Commit b41901a2cf06 ("ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity") added support for some (broken) devices which always report 0 for both design- and full_charge-capacity. This assumes that if the capacity is not being reported it is 0. The ThunderSoft TS178 tablet's _BIX implementation falsifies this assumption. It reports ACPI_BATTERY_VALUE_UNKNOWN (-1) as full_charge_capacity, which we treat as a valid value which causes several problems. This commit fixes this by adding a new ACPI_BATTERY_CAPACITY_VALID() helper which checks that the value is not 0 and not -1; and using this whenever we need to test if either design_capacity or full_charge_capacity is valid. Fixes: b41901a2cf06 ("ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity") Cc: 4.19+ # 4.19+ Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 832d6f76f268b10386a768fcad88937c3b0f12e1 Author: Hans de Goede Date: Tue Dec 17 20:08:11 2019 +0100 ACPI: video: Do not export a non working backlight interface on MSI MS-7721 boards commit d21a91629f4b8e794fc4c0e0c17c85cedf1d806c upstream. Despite our heuristics to not wrongly export a non working ACPI backlight interface on desktop machines, we still end up exporting one on desktops using a motherboard from the MSI MS-7721 series. I've looked at improving the heuristics, but in this case a quirk seems to be the only way to solve this. While at it also add a comment to separate the video_detect_force_none entries in the video_detect_dmi_table from other type of entries, as we already do for the other entry types. Cc: All applicable BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1783786 Signed-off-by: Hans de Goede Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 44f6e43924f8f95a292773724a66b0e1010053cb Author: Linus Walleij Date: Wed Dec 4 16:27:49 2019 +0100 mmc: spi: Toggle SPI polarity, do not hardcode it commit af3ed119329cf9690598c5a562d95dfd128e91d6 upstream. The code in mmc_spi_initsequence() tries to send a burst with high chipselect and for this reason hardcodes the device into SPI_CS_HIGH. This is not good because the SPI_CS_HIGH flag indicates logical "asserted" CS not always the physical level. In some cases the signal is inverted in the GPIO library and in that case SPI_CS_HIGH is already set, and enforcing SPI_CS_HIGH again will actually drive it low. Instead of hard-coding this, toggle the polarity so if the default is LOW it goes high to assert chipselect but if it is already high then toggle it low instead. Cc: Phil Elwell Reported-by: Mark Brown Signed-off-by: Linus Walleij Reviewed-by: Mark Brown Link: https://lore.kernel.org/r/20191204152749.12652-1-linus.walleij@linaro.org Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 47bdd025b921a08f146db33715c49fe0c1dba581 Author: Kishon Vijay Abraham I Date: Tue Jan 21 17:27:34 2020 +0530 PCI: keystone: Fix error handling when "num-viewport" DT property is not populated commit b0de922af53eede340986a2d05b6cd4b6d6efa43 upstream. Fix error handling when "num-viewport" DT property is not populated. Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms") Signed-off-by: Kishon Vijay Abraham I Signed-off-by: Lorenzo Pieralisi Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Greg Kroah-Hartman commit a8b56e222300a9214f77766274dccff2edcfcc00 Author: Yurii Monakov Date: Tue Dec 17 14:38:36 2019 +0300 PCI: keystone: Fix link training retries initiation commit 6df19872d881641e6394f93ef2938cffcbdae5bb upstream. ks_pcie_stop_link() function does not clear LTSSM_EN_VAL bit so link training was not triggered more than once after startup. In configurations where link can be unstable during early boot, for example, under low temperature, it will never be established. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Signed-off-by: Yurii Monakov Signed-off-by: Lorenzo Pieralisi Acked-by: Andrew Murray Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6700c0d9ae92bae13b9f79670f0195f35a602b5d Author: Yurii Monakov Date: Fri Oct 4 18:48:11 2019 +0300 PCI: keystone: Fix outbound region mapping commit 2d0c3fbe43fa0e6fcb7a6c755c5f4cd702c0d2f4 upstream. The Keystone outbound Address Translation Unit (ATU) maps PCI MMIO space in 8 MB windows. When programming the ATU windows, we previously incremented the starting address by 8, not 8 MB, so all the windows were mapped to the first 8 MB. Therefore, only 8 MB of MMIO space was accessible. Update the loop so it increments the starting address by 8 MB, not 8, so more MMIO space is accessible. Fixes: e75043ad9792 ("PCI: keystone: Cleanup outbound window configuration") Link: https://lore.kernel.org/r/20191004154811.GA31397@monakov-y.office.kontur-niirs.ru Signed-off-by: Yurii Monakov [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas Signed-off-by: Lorenzo Pieralisi Acked-by: Andrew Murray Acked-by: Kishon Vijay Abraham I Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Greg Kroah-Hartman commit 05d56da81d941d5c20be373e422fb7e709d93c0d Author: David Engraf Date: Mon Dec 16 12:18:25 2019 +0100 PCI: tegra: Fix return value check of pm_runtime_get_sync() commit 885199148442f56b880995d703d2ed03b6481a3c upstream. pm_runtime_get_sync() returns the device's usage counter. This might be >0 if the device is already powered up or CONFIG_PM is disabled. Abort probe function on real error only. Fixes: da76ba50963b ("PCI: tegra: Add power management support") Link: https://lore.kernel.org/r/20191216111825.28136-1-david.engraf@sysgo.com Signed-off-by: David Engraf Signed-off-by: Bjorn Helgaas Signed-off-by: Lorenzo Pieralisi Acked-by: Andrew Murray Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Greg Kroah-Hartman commit 38b67e60b6b582e81f9db1b2e7176cbbfbd3e574 Author: Tom Zanussi Date: Wed Jan 29 21:18:18 2020 -0500 tracing: Fix now invalid var_ref_vals assumption in trace action [ Upstream commit d380dcde9a07ca5de4805dee11f58a98ec0ad6ff ] The patch 'tracing: Fix histogram code when expression has same var as value' added code to return an existing variable reference when creating a new variable reference, which resulted in var_ref_vals slots being reused instead of being duplicated. The implementation of the trace action assumes that the end of the var_ref_vals array starting at action_data.var_ref_idx corresponds to the values that will be assigned to the trace params. The patch mentioned above invalidates that assumption, which means that each param needs to explicitly specify its index into var_ref_vals. This fix changes action_data.var_ref_idx to an array of var ref indexes to account for that. Link: https://lore.kernel.org/r/1580335695.6220.8.camel@kernel.org Fixes: 8bcebc77e85f ("tracing: Fix histogram code when expression has same var as value") Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 5b92f86c84928e9b532124bae09471c69aa54153 Author: Christophe Leroy Date: Mon Jan 27 10:42:04 2020 +0000 powerpc/32s: Fix CPU wake-up from sleep mode commit 9933819099c4600b41a042f27a074470a43cf6b9 upstream. Commit f7354ccac844 ("powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPU") broke the CPU wake-up from sleep mode (i.e. when _TLF_SLEEPING is set) by delaying the tovirt(r2, r2). This is because r2 is not restored by fast_exception_return. It used to work (by chance ?) because CPU wake-up interrupt never comes from user, so r2 is expected to point to 'current' on return. Commit e2fb9f544431 ("powerpc/32: Prepare for Kernel Userspace Access Protection") broke it even more by clobbering r0 which is not restored by fast_exception_return either. Use r6 instead of r0. This is possible because r3-r6 are restored by fast_exception_return and only r3-r5 are used for exception arguments. For r2 it could be converted back to virtual address, but stay on the safe side and restore it from the stack instead. It should be live in the cache at that moment, so loading from the stack should make no difference compared to converting it from phys to virt. Fixes: f7354ccac844 ("powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPU") Fixes: e2fb9f544431 ("powerpc/32: Prepare for Kernel Userspace Access Protection") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/6d02c3ae6ad77af34392e98117e44c2bf6d13ba1.1580121710.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman commit 4135a03fdf2eae8cf6bd83b9c9ffb1496d5d5570 Author: Christophe Leroy Date: Fri Jan 24 11:54:40 2020 +0000 powerpc/32s: Fix bad_kuap_fault() commit 6ec20aa2e510b6297906c45f009aa08b2d97269a upstream. At the moment, bad_kuap_fault() reports a fault only if a bad access to userspace occurred while access to userspace was not granted. But if a fault occurs for a write outside the allowed userspace segment(s) that have been unlocked, bad_kuap_fault() fails to detect it and the kernel loops forever in do_page_fault(). Fix it by checking that the accessed address is within the allowed range. Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/f48244e9485ada0a304ed33ccbb8da271180c80d.1579866752.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman commit 1bd3b871af5718121bfb820b73bb61a80ac1928a Author: Pingfan Liu Date: Fri Jan 10 12:54:02 2020 +0800 powerpc/pseries: Advance pfn if section is not present in lmb_is_removable() commit fbee6ba2dca30d302efe6bddb3a886f5e964a257 upstream. In lmb_is_removable(), if a section is not present, it should continue to test the rest of the sections in the block. But the current code fails to do so. Fixes: 51925fb3c5c9 ("powerpc/pseries: Implement memory hotplug remove in the kernel") Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Pingfan Liu Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1578632042-12415-1-git-send-email-kernelfans@gmail.com Signed-off-by: Greg Kroah-Hartman commit 02c4699fb664004a38ac12b09181df7cddcb39db Author: Sukadev Bhattiprolu Date: Mon Jan 6 13:50:02 2020 -0600 powerpc/xmon: don't access ASDR in VMs commit c2a20711fc181e7f22ee5c16c28cb9578af84729 upstream. ASDR is HV-privileged and must only be accessed in HV-mode. Fixes a Program Check (0x700) when xmon in a VM dumps SPRs. Fixes: d1e1b351f50f ("powerpc/xmon: Add ISA v3.0 SPRs to SPR dump") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Sukadev Bhattiprolu Reviewed-by: Andrew Donnellan Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200107021633.GB29843@us.ibm.com Signed-off-by: Greg Kroah-Hartman commit 796085dbe323d82e511105690bf5d15650bb342e Author: Christophe Leroy Date: Tue Jan 14 08:13:09 2020 +0000 powerpc/ptdump: Fix W+X verification commit d80ae83f1f932ab7af47b54d0d3bef4f4dba489f upstream. Verification cannot rely on simple bit checking because on some platforms PAGE_RW is 0, checking that a page is not W means checking that PAGE_RO is set instead of checking that PAGE_RW is not set. Use pte helpers instead of checking bits. Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/0d894839fdbb19070f0e1e4140363be4f2bb62fc.1578989540.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman commit 2cabe61ab8ae633c01fe4f7e5001f14e9287538c Author: Aneesh Kumar K.V Date: Mon Feb 3 17:36:46 2020 -0800 powerpc/mmu_gather: enable RCU_TABLE_FREE even for !SMP case commit 12e4d53f3f04e81f9e83d6fc10edc7314ab9f6b9 upstream. Patch series "Fixup page directory freeing", v4. This is a repost of patch series from Peter with the arch specific changes except ppc64 dropped. ppc64 changes are added here because we are redoing the patch series on top of ppc64 changes. This makes it easy to backport these changes. Only the first 2 patches need to be backported to stable. The thing is, on anything SMP, freeing page directories should observe the exact same order as normal page freeing: 1) unhook page/directory 2) TLB invalidate 3) free page/directory Without this, any concurrent page-table walk could end up with a Use-after-Free. This is esp. trivial for anything that has software page-table walkers (HAVE_FAST_GUP / software TLB fill) or the hardware caches partial page-walks (ie. caches page directories). Even on UP this might give issues since mmu_gather is preemptible these days. An interrupt or preempted task accessing user pages might stumble into the free page if the hardware caches page directories. This patch series fixes ppc64 and add generic MMU_GATHER changes to support the conversion of other architectures. I haven't added patches w.r.t other architecture because they are yet to be acked. This patch (of 9): A followup patch is going to make sure we correctly invalidate page walk cache before we free page table pages. In order to keep things simple enable RCU_TABLE_FREE even for !SMP so that we don't have to fixup the !SMP case differently in the followup patch !SMP case is right now broken for radix translation w.r.t page walk cache flush. We can get interrupted in between page table free and that would imply we have page walk cache entries pointing to tables which got freed already. Michael said "both our platforms that run on Power9 force SMP on in Kconfig, so the !SMP case is unlikely to be a problem for anyone in practice, unless they've hacked their kernel to build it !SMP." Link: http://lkml.kernel.org/r/20200116064531.483522-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V Acked-by: Peter Zijlstra (Intel) Acked-by: Michael Ellerman Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 63098a93063a52e57b26b37014ae9ac7b8f87a41 Author: Gerald Schaefer Date: Thu Jan 16 19:59:04 2020 +0100 s390/mm: fix dynamic pagetable upgrade for hugetlbfs commit 5f490a520bcb393389a4d44bec90afcb332eb112 upstream. Commit ee71d16d22bb ("s390/mm: make TASK_SIZE independent from the number of page table levels") changed the logic of TASK_SIZE and also removed the arch_mmap_check() implementation for s390. This combination has a subtle effect on how get_unmapped_area() for hugetlbfs pages works. It is now possible that a user process establishes a hugetlbfs mapping at an address above 4 TB, without triggering a dynamic pagetable upgrade from 3 to 4 levels. This is because hugetlbfs mappings will not use mm->get_unmapped_area, but rather file->f_op->get_unmapped_area, which currently is the generic implementation of hugetlb_get_unmapped_area() that does not know about s390 dynamic pagetable upgrades, but with the new definition of TASK_SIZE, it will now allow mappings above 4 TB. Subsequent access to such a mapped address above 4 TB will result in a page fault loop, because the CPU cannot translate such a large address with 3 pagetable levels. The fault handler will try to map in a hugepage at the address, but due to the folded pagetable logic it will end up with creating entries in the 3 level pagetable, possibly overwriting existing mappings, and then it all repeats when the access is retried. Apart from the page fault loop, this can have various nasty effects, e.g. kernel panic from one of the BUG_ON() checks in memory management code, or even data loss if an existing mapping gets overwritten. Fix this by implementing HAVE_ARCH_HUGETLB_UNMAPPED_AREA support for s390, providing an s390 version for hugetlb_get_unmapped_area() with pagetable upgrade support similar to arch_get_unmapped_area(), which will then be used instead of the generic version. Fixes: ee71d16d22bb ("s390/mm: make TASK_SIZE independent from the number of page table levels") Cc: # 4.12+ Signed-off-by: Gerald Schaefer Signed-off-by: Vasily Gorbik Signed-off-by: Greg Kroah-Hartman commit e25f00c69039ea11f21c2a1a6ca7cd3c6a7824e7 Author: Alexander Lobakin Date: Fri Jan 17 17:02:08 2020 +0300 MIPS: boot: fix typo in 'vmlinux.lzma.its' target commit 16202c09577f3d0c533274c0410b7de05fb0d458 upstream. Commit 92b34a976348 ("MIPS: boot: add missing targets for vmlinux.*.its") fixed constant rebuild of *.its files on every make invocation, but due to typo ("lzmo") it made no sense for vmlinux.lzma.its. Fixes: 92b34a976348 ("MIPS: boot: add missing targets for vmlinux.*.its") Cc: # v4.19+ Signed-off-by: Alexander Lobakin [paulburton@kernel.org: s/invokation/invocation/] Signed-off-by: Paul Burton Cc: Ralf Baechle Cc: James Hogan Cc: Masahiro Yamada Cc: Rob Herring Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit bd9abdfd68b2fdf991eba025e222bebf83ad690f Author: Alexander Lobakin Date: Fri Jan 17 17:02:07 2020 +0300 MIPS: fix indentation of the 'RELOCS' message commit a53998802e178451701d59d38e36f551422977ba upstream. quiet_cmd_relocs lacks a whitespace which results in: LD vmlinux SORTEX vmlinux SYSMAP System.map RELOCS vmlinux Building modules, stage 2. MODPOST 64 modules After this patch: LD vmlinux SORTEX vmlinux SYSMAP System.map RELOCS vmlinux Building modules, stage 2. MODPOST 64 modules Typo is present in kernel tree since the introduction of relocatable kernel support in commit e818fac595ab ("MIPS: Generate relocation table when CONFIG_RELOCATABLE"), but the relocation scripts were moved to Makefile.postlink later with commit 44079d3509ae ("MIPS: Use Makefile.postlink to insert relocations into vmlinux"). Fixes: 44079d3509ae ("MIPS: Use Makefile.postlink to insert relocations into vmlinux") Cc: # v4.11+ Signed-off-by: Alexander Lobakin [paulburton@kernel.org: Fixup commit references in commit message.] Signed-off-by: Paul Burton Cc: Ralf Baechle Cc: James Hogan Cc: Masahiro Yamada Cc: Rob Herring Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6b29d4a1f832a2f2c0bf22ba525b6a97e745dbfb Author: Alexander Lobakin Date: Fri Jan 17 17:02:09 2020 +0300 MIPS: syscalls: fix indentation of the 'SYSNR' message commit 4f29ad200f7b40fbcf73cd65f95087535ba78380 upstream. It also lacks a whitespace (copy'n'paste error?) and also messes up the output: SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h After: SYSHDR arch/mips/include/generated/uapi/asm/unistd_n32.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_n64.h SYSHDR arch/mips/include/generated/uapi/asm/unistd_o32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n32.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_n64.h SYSNR arch/mips/include/generated/uapi/asm/unistd_nr_o32.h WRAP arch/mips/include/generated/uapi/asm/bpf_perf_event.h WRAP arch/mips/include/generated/uapi/asm/ipcbuf.h Present since day 0 of syscall table generation introduction for MIPS. Fixes: 9bcbf97c6293 ("mips: add system call table generation support") Cc: # v5.0+ Signed-off-by: Alexander Lobakin Signed-off-by: Paul Burton Cc: Ralf Baechle Cc: James Hogan Cc: Masahiro Yamada Cc: Rob Herring Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 6cca9100db9048e09fe34866a8fd82a70236867f Author: Christoffer Dall Date: Thu Dec 12 20:50:55 2019 +0100 KVM: arm64: Only sign-extend MMIO up to register width commit b6ae256afd32f96bec0117175b329d0dd617655e upstream. On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit register, and we should only sign extend the register up to the width of the register as specified in the operation (by using the 32-bit Wn or 64-bit Xn register specifier). As it turns out, the architecture provides this decoding information in the SF ("Sixty-Four" -- how cute...) bit. Let's take advantage of this with the usual 32-bit/64-bit header file dance and do the right thing on AArch64 hosts. Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com Signed-off-by: Greg Kroah-Hartman commit 4dd5c62d2e4c1a69920b02fee1571240af7dc3ef Author: Mark Rutland Date: Wed Jan 8 13:43:24 2020 +0000 KVM: arm/arm64: Correct AArch32 SPSR on exception entry commit 1cfbb484de158e378e8971ac40f3082e53ecca55 upstream. Confusingly, there are three SPSR layouts that a kernel may need to deal with: (1) An AArch64 SPSR_ELx view of an AArch64 pstate (2) An AArch64 SPSR_ELx view of an AArch32 pstate (3) An AArch32 SPSR_* view of an AArch32 pstate When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions match the AArch32 SPSR_* view. However, when we inject an exception into an AArch32 guest, we have to synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64 host needs to synthesize layout #3 from layout #2. This patch adds a new host_spsr_to_spsr32() helper for this, and makes use of it in the KVM AArch32 support code. For arm64 we need to shuffle the DIT bit around, and remove the SS bit, while for arm we can use the value as-is. I've open-coded the bit manipulation for now to avoid having to rework the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_* definitions. I hope to perform a more thorough refactoring in future so that we can handle pstate view manipulation more consistently across the kernel tree. Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Reviewed-by: Alexandru Elisei Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com Signed-off-by: Greg Kroah-Hartman commit b0e01e9d23530dc26e6cb18736a8bd3e20949995 Author: Mark Rutland Date: Wed Jan 8 13:43:23 2020 +0000 KVM: arm/arm64: Correct CPSR on exception entry commit 3c2483f15499b877ccb53250d88addb8c91da147 upstream. When KVM injects an exception into a guest, it generates the CPSR value from scratch, configuring CPSR.{M,A,I,T,E}, and setting all other bits to zero. This isn't correct, as the architecture specifies that some CPSR bits are (conditionally) cleared or set upon an exception, and others are unchanged from the original context. This patch adds logic to match the architectural behaviour. To make this simple to follow/audit/extend, documentation references are provided, and bits are configured in order of their layout in SPSR_EL2. This layout can be seen in the diagram on ARM DDI 0487E.a page C5-426. Note that this code is used by both arm and arm64, and is intended to fuction with the SPSR_EL2 and SPSR_HYP layouts. Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Reviewed-by: Alexandru Elisei Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200108134324.46500-3-mark.rutland@arm.com Signed-off-by: Greg Kroah-Hartman commit cc7931dc766fac756c95b36b59484abdae87fd31 Author: Mark Rutland Date: Wed Jan 8 13:43:22 2020 +0000 KVM: arm64: Correct PSTATE on exception entry commit a425372e733177eb0779748956bc16c85167af48 upstream. When KVM injects an exception into a guest, it generates the PSTATE value from scratch, configuring PSTATE.{M[4:0],DAIF}, and setting all other bits to zero. This isn't correct, as the architecture specifies that some PSTATE bits are (conditionally) cleared or set upon an exception, and others are unchanged from the original context. This patch adds logic to match the architectural behaviour. To make this simple to follow/audit/extend, documentation references are provided, and bits are configured in order of their layout in SPSR_EL2. This layout can be seen in the diagram on ARM DDI 0487E.a page C5-429. Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Reviewed-by: Alexandru Elisei Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200108134324.46500-2-mark.rutland@arm.com Signed-off-by: Greg Kroah-Hartman commit 5222ded5c72ce7b173c66d3e3fb2af6bea511367 Author: Mark Rutland Date: Wed Jan 22 12:45:46 2020 +0000 arm64: acpi: fix DAIF manipulation with pNMI commit e533dbe9dcb199bb637a2c465f3a6e70564994fe upstream. Since commit: d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface") ... the top-level APEI SEA handler has the shape: 1. current_flags = arch_local_save_flags() 2. local_daif_restore(DAIF_ERRCTX) 3. 4. local_daif_restore(current_flags) However, since commit: 4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") ... when pseudo-NMIs (pNMIs) are in use, arch_local_save_flags() will save the PMR value rather than the DAIF flags. The combination of these two commits means that the APEI SEA handler will erroneously attempt to restore the PMR value into DAIF. Fix this by factoring local_daif_save_flags() out of local_daif_save(), so that we can consistently save DAIF in step #1, regardless of whether pNMIs are in use. Both commits were introduced concurrently in v5.0. Cc: Fixes: 4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Fixes: d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface") Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: James Morse Cc: Julien Thierry Cc: Will Deacon Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 79c56db06547d9f853f81d2f12b5a11a96cf436c Author: Yong Zhi Date: Fri Jan 31 14:40:03 2020 -0600 ALSA: hda: Add JasperLake PCI ID and codec vid commit 78be2228c15dd45865b102b29d72e721f0ace9b1 upstream. Add HD Audio Device PCI ID and codec vendor_id for the Intel JasperLake REV2/A0 silicon. Signed-off-by: Yong Zhi Signed-off-by: Pierre-Louis Bossart Cc: Link: https://lore.kernel.org/r/20200131204003.10153-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 3d938d9febd12902cac71ad6c8c800aff8b28d22 Author: Hans de Goede Date: Sat Jan 25 19:10:21 2020 +0100 ALSA: hda: Add Clevo W65_67SB the power_save blacklist commit d8feb6080bb0c9f4d799a423d9453048fdd06990 upstream. Using HDA power-saving on the Clevo W65_67SB causes the first 0.5 seconds of audio to be missing every time audio starts playing. This commit adds the Clevo W65_67SB the power_save blacklist to avoid this issue. Cc: stable@vger.kernel.org BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1525104 Signed-off-by: Hans de Goede Link: https://lore.kernel.org/r/20200125181021.70446-1-hdegoede@redhat.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 6cb7581f5702c1f8e0f463338f1337a1858ec0bd Author: Takashi Iwai Date: Mon Jan 20 11:41:27 2020 +0100 ALSA: hda: Apply aligned MMIO access only conditionally commit 4d024fe8f806e20e577cc934204c5784c7063293 upstream. It turned out that the recent simplification of HD-audio bus access helpers caused a regression on the virtual HD-audio device on QEMU with ARM platforms. The driver got a CORB/RIRB timeout and couldn't probe any codecs. The essential difference that caused a problem was the enforced aligned MMIO accesses by simplification. Since snd-hda-tegra driver is enabled on ARM, it enables CONFIG_SND_HDA_ALIGNED_MMIO, which makes the all HD-audio drivers using the aligned MMIO accesses. While this is mandatory for snd-hda-tegra, it seems that snd-hda-intel on ARM gets broken by this access pattern. For addressing the regression, this patch introduces a new flag, aligned_mmio, to hdac_bus object, and applies the aligned MMIO only when this flag is set. This change affects only platforms with CONFIG_SND_HDA_ALIGNED_MMIO set, i.e. mostly only for ARM platforms. Unfortunately the patch became a big bigger than it should be, just because the former calls didn't take hdac_bus object in the argument, hence we had to extend the call patterns. Fixes: 19abfefd4c76 ("ALSA: hda: Direct MMIO accesses") BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1161152 Cc: Link: https://lore.kernel.org/r/20200120104127.28985-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 68efc422c5472b68f071904d9848c999a0c33f3f Author: Mika Westerberg Date: Wed Jan 22 19:28:04 2020 +0300 platform/x86: intel_scu_ipc: Fix interrupt support commit e48b72a568bbd641c91dad354138d3c17d03ee6f upstream. Currently the driver has disabled interrupt support for Tangier but actually interrupt works just fine if the command is not written twice in a row. Also we need to ack the interrupt in the handler. Signed-off-by: Mika Westerberg Reviewed-by: Andy Shevchenko Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko Signed-off-by: Greg Kroah-Hartman commit 5bf25f3828a292a003d52cdc04dcf5ce0fc0a8b5 Author: Pawan Gupta Date: Fri Jan 10 14:50:54 2020 -0800 x86/cpu: Update cached HLE state on write to TSX_CTRL_CPUID_CLEAR commit 5efc6fa9044c3356d6046c6e1da6d02572dbed6b upstream. /proc/cpuinfo currently reports Hardware Lock Elision (HLE) feature to be present on boot cpu even if it was disabled during the bootup. This is because cpuinfo_x86->x86_capability HLE bit is not updated after TSX state is changed via the new MSR IA32_TSX_CTRL. Update the cached HLE bit also since it is expected to change after an update to CPUID_CLEAR bit in MSR IA32_TSX_CTRL. Fixes: 95c5824f75f3 ("x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default") Signed-off-by: Pawan Gupta Signed-off-by: Thomas Gleixner Tested-by: Neelima Krishnan Reviewed-by: Dave Hansen Reviewed-by: Josh Poimboeuf Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2529b99546294c893dfa1c89e2b3e46da3369a59.1578685425.git.pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 146f086a409ba488d5cada8f38806809a5a9834a Author: Kevin Hao Date: Mon Jan 20 12:35:47 2020 +0800 irqdomain: Fix a memory leak in irq_domain_push_irq() commit 0f394daef89b38d58c91118a2b08b8a1b316703b upstream. Fix a memory leak reported by kmemleak: unreferenced object 0xffff000bc6f50e80 (size 128): comm "kworker/23:2", pid 201, jiffies 4294894947 (age 942.132s) hex dump (first 32 bytes): 00 00 00 00 41 00 00 00 86 c0 03 00 00 00 00 00 ....A........... 00 a0 b2 c6 0b 00 ff ff 40 51 fd 10 00 80 ff ff ........@Q...... backtrace: [<00000000e62d2240>] kmem_cache_alloc_trace+0x1a4/0x320 [<00000000279143c9>] irq_domain_push_irq+0x7c/0x188 [<00000000d9f4c154>] thunderx_gpio_probe+0x3ac/0x438 [<00000000fd09ec22>] pci_device_probe+0xe4/0x198 [<00000000d43eca75>] really_probe+0xdc/0x320 [<00000000d3ebab09>] driver_probe_device+0x5c/0xf0 [<000000005b3ecaa0>] __device_attach_driver+0x88/0xc0 [<000000004e5915f5>] bus_for_each_drv+0x7c/0xc8 [<0000000079d4db41>] __device_attach+0xe4/0x140 [<00000000883bbda9>] device_initial_probe+0x18/0x20 [<000000003be59ef6>] bus_probe_device+0x98/0xa0 [<0000000039b03d3f>] deferred_probe_work_func+0x74/0xa8 [<00000000870934ce>] process_one_work+0x1c8/0x470 [<00000000e3cce570>] worker_thread+0x1f8/0x428 [<000000005d64975e>] kthread+0xfc/0x128 [<00000000f0eaa764>] ret_from_fork+0x10/0x18 Fixes: 495c38d3001f ("irqdomain: Add irq_domain_{push,pop}_irq() functions") Signed-off-by: Kevin Hao Signed-off-by: Marc Zyngier Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200120043547.22271-1-haokexin@gmail.com Signed-off-by: Greg Kroah-Hartman commit db165906cad5d556ee7bbd78d69dc06dc44030e2 Author: Gustavo A. R. Silva Date: Thu Jan 30 22:13:51 2020 -0800 lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more() commit 3e21d9a501bf99aee2e5835d7f34d8c823f115b5 upstream. In case memory resources for _ptr2_ were allocated, release them before return. Notice that in case _ptr1_ happens to be NULL, krealloc() behaves exactly like kmalloc(). Addresses-Coverity-ID: 1490594 ("Resource leak") Link: http://lkml.kernel.org/r/20200123160115.GA4202@embeddedor Fixes: 3f15801cdc23 ("lib: add kasan test module") Signed-off-by: Gustavo A. R. Silva Reviewed-by: Dmitry Vyukov Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 9cbcbfc67b256a33bc063764407b23946f0ee427 Author: Helen Koike Date: Tue Dec 17 21:00:22 2019 +0100 media: v4l2-rect.h: fix v4l2_rect_map_inside() top/left adjustments commit f51e50db4c20d46930b33be3f208851265694f3e upstream. boundary->width and boundary->height are sizes relative to boundary->left and boundary->top coordinates, but they were not being taken into consideration to adjust r->left and r->top, leading to the following error: Consider the follow as initial values for boundary and r: struct v4l2_rect boundary = { .left = 100, .top = 100, .width = 800, .height = 600, } struct v4l2_rect r = { .left = 0, .top = 0, .width = 1920, .height = 960, } calling v4l2_rect_map_inside(&r, &boundary) was modifying r to: r = { .left = 0, .top = 0, .width = 800, .height = 600, } Which is wrongly outside the boundary rectangle, because: v4l2_rect_set_max_size(r, boundary); // r->width = 800, r->height = 600 ... if (r->left + r->width > boundary->width) // true r->left = boundary->width - r->width; // r->left = 800 - 800 if (r->top + r->height > boundary->height) // true r->top = boundary->height - r->height; // r->height = 600 - 600 Fix this by considering top/left coordinates from boundary. Fixes: ac49de8c49d7 ("[media] v4l2-rect.h: new header with struct v4l2_rect helper functions") Signed-off-by: Helen Koike Cc: # for v4.7 and up Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 965ccdedf1bfdea9f41509d1b63969432976337a Author: Arnd Bergmann Date: Mon Dec 16 15:15:01 2019 +0100 media: v4l2-core: compat: ignore native command codes commit 4a873f3fa5d6ca52e446d306dd7194dd86a09422 upstream. The do_video_ioctl() compat handler converts the compat command codes into the native ones before processing further, but this causes problems for 32-bit user applications that pass a command code that matches a 64-bit native number, which will then be handled the same way. Specifically, this breaks VIDIOC_DQEVENT_TIME from user space applications with 64-bit time_t, as the structure layout is the same as the native 64-bit layout on many architectures (x86 being the notable exception). Change the handler to use the converted command code only for passing into the native ioctl handler, not for deciding on the conversion, in order to make the compat behavior match the native behavior. Actual support for the 64-bit time_t version of VIDIOC_DQEVENT_TIME and other commands still needs to be added in a separate patch. Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit d2db1cbc751f58587070e87f287ffced47945f5b Author: John Hubbard Date: Thu Jan 30 22:12:50 2020 -0800 media/v4l2-core: set pages dirty upon releasing DMA buffers commit 3c7470b6f68434acae459482ab920d1e3fabd1c7 upstream. After DMA is complete, and the device and CPU caches are synchronized, it's still required to mark the CPU pages as dirty, if the data was coming from the device. However, this driver was just issuing a bare put_page() call, without any set_page_dirty*() call. Fix the problem, by calling set_page_dirty_lock() if the CPU pages were potentially receiving data from the device. Link: http://lkml.kernel.org/r/20200107224558.2362728-11-jhubbard@nvidia.com Signed-off-by: John Hubbard Reviewed-by: Christoph Hellwig Acked-by: Hans Verkuil Cc: Mauro Carvalho Chehab Cc: Cc: Alex Williamson Cc: Aneesh Kumar K.V Cc: Björn Töpel Cc: Daniel Vetter Cc: Dan Williams Cc: Ira Weiny Cc: Jan Kara Cc: Jason Gunthorpe Cc: Jason Gunthorpe Cc: Jens Axboe Cc: Jerome Glisse Cc: Jonathan Corbet Cc: Kirill A. Shutemov Cc: Leon Romanovsky Cc: Mike Rapoport Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d364e9b37cc981eccd5301d4ceadd1931efff376 Author: Yang Shi Date: Thu Jan 30 22:11:24 2020 -0800 mm: move_pages: report the number of non-attempted pages commit 5984fabb6e82d9ab4e6305cb99694c85d46de8ae upstream. Since commit a49bd4d71637 ("mm, numa: rework do_pages_move"), the semantic of move_pages() has changed to return the number of non-migrated pages if they were result of a non-fatal reasons (usually a busy page). This was an unintentional change that hasn't been noticed except for LTP tests which checked for the documented behavior. There are two ways to go around this change. We can even get back to the original behavior and return -EAGAIN whenever migrate_pages is not able to migrate pages due to non-fatal reasons. Another option would be to simply continue with the changed semantic and extend move_pages documentation to clarify that -errno is returned on an invalid input or when migration simply cannot succeed (e.g. -ENOMEM, -EBUSY) or the number of pages that couldn't have been migrated due to ephemeral reasons (e.g. page is pinned or locked for other reasons). This patch implements the second option because this behavior is in place for some time without anybody complaining and possibly new users depending on it. Also it allows to have a slightly easier error handling as the caller knows that it is worth to retry when err > 0. But since the new semantic would be aborted immediately if migration is failed due to ephemeral reasons, need include the number of non-attempted pages in the return value too. Link: http://lkml.kernel.org/r/1580160527-109104-1-git-send-email-yang.shi@linux.alibaba.com Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move") Signed-off-by: Yang Shi Suggested-by: Michal Hocko Acked-by: Michal Hocko Reviewed-by: Wei Yang Cc: [4.17+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 95419e7ef2660897928aaa61d1d5d395dc7bce95 Author: Wei Yang Date: Thu Jan 30 22:11:20 2020 -0800 mm: thp: don't need care deferred split queue in memcg charge move path commit fac0516b5534897bf4c4a88daa06a8cfa5611b23 upstream. If compound is true, this means it is a PMD mapped THP. Which implies the page is not linked to any defer list. So the first code chunk will not be executed. Also with this reason, it would not be proper to add this page to a defer list. So the second code chunk is not correct. Based on this, we should remove the defer list related code. [yang.shi@linux.alibaba.com: better patch title] Link: http://lkml.kernel.org/r/20200117233836.3434-1-richardw.yang@linux.intel.com Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware") Signed-off-by: Wei Yang Suggested-by: Kirill A. Shutemov Acked-by: Yang Shi Cc: David Rientjes Cc: Michal Hocko Cc: Kirill A. Shutemov Cc: Johannes Weiner Cc: Vladimir Davydov Cc: [5.4+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit aab4189dfdb1ea7f13bb2fee0c9990f2225aebc5 Author: Dan Williams Date: Thu Jan 30 22:11:17 2020 -0800 mm/memory_hotplug: fix remove_memory() lockdep splat commit f1037ec0cc8ac1a450974ad9754e991f72884f48 upstream. The daxctl unit test for the dax_kmem driver currently triggers the (false positive) lockdep splat below. It results from the fact that remove_memory_block_devices() is invoked under the mem_hotplug_lock() causing lockdep entanglements with cpu_hotplug_lock() and sysfs (kernfs active state tracking). It is a false positive because the sysfs attribute path triggering the memory remove is not the same attribute path associated with memory-block device. sysfs_break_active_protection() is not applicable since there is no real deadlock conflict, instead move memory-block device removal outside the lock. The mem_hotplug_lock() is not needed to synchronize the memory-block device removal vs the page online state, that is already handled by lock_device_hotplug(). Specifically, lock_device_hotplug() is sufficient to allow try_remove_memory() to check the offline state of the memblocks and be assured that any in progress online attempts are flushed / blocked by kernfs_drain() / attribute removal. The add_memory() path safely creates memblock devices under the mem_hotplug_lock(). There is no kernfs active state synchronization in the memblock device_register() path, so nothing to fix there. This change is only possible thanks to the recent change that refactored memory block device removal out of arch_remove_memory() (commit 4c4b7f9ba948 "mm/memory_hotplug: remove memory block devices before arch_remove_memory()"), and David's due diligence tracking down the guarantees afforded by kernfs_drain(). Not flagged for -stable since this only impacts ongoing development and lockdep validation, not a runtime issue. ====================================================== WARNING: possible circular locking dependency detected 5.5.0-rc3+ #230 Tainted: G OE ------------------------------------------------------ lt-daxctl/6459 is trying to acquire lock: ffff99c7f0003510 (kn->count#241){++++}, at: kernfs_remove_by_name_ns+0x41/0x80 but task is already holding lock: ffffffffa76a5450 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x20/0xe0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (mem_hotplug_lock.rw_sem){++++}: __lock_acquire+0x39c/0x790 lock_acquire+0xa2/0x1b0 get_online_mems+0x3e/0xb0 kmem_cache_create_usercopy+0x2e/0x260 kmem_cache_create+0x12/0x20 ptlock_cache_init+0x20/0x28 start_kernel+0x243/0x547 secondary_startup_64+0xb6/0xc0 -> #1 (cpu_hotplug_lock.rw_sem){++++}: __lock_acquire+0x39c/0x790 lock_acquire+0xa2/0x1b0 cpus_read_lock+0x3e/0xb0 online_pages+0x37/0x300 memory_subsys_online+0x17d/0x1c0 device_online+0x60/0x80 state_store+0x65/0xd0 kernfs_fop_write+0xcf/0x1c0 vfs_write+0xdb/0x1d0 ksys_write+0x65/0xe0 do_syscall_64+0x5c/0xa0 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (kn->count#241){++++}: check_prev_add+0x98/0xa40 validate_chain+0x576/0x860 __lock_acquire+0x39c/0x790 lock_acquire+0xa2/0x1b0 __kernfs_remove+0x25f/0x2e0 kernfs_remove_by_name_ns+0x41/0x80 remove_files.isra.0+0x30/0x70 sysfs_remove_group+0x3d/0x80 sysfs_remove_groups+0x29/0x40 device_remove_attrs+0x39/0x70 device_del+0x16a/0x3f0 device_unregister+0x16/0x60 remove_memory_block_devices+0x82/0xb0 try_remove_memory+0xb5/0x130 remove_memory+0x26/0x40 dev_dax_kmem_remove+0x44/0x6a [kmem] device_release_driver_internal+0xe4/0x1c0 unbind_store+0xef/0x120 kernfs_fop_write+0xcf/0x1c0 vfs_write+0xdb/0x1d0 ksys_write+0x65/0xe0 do_syscall_64+0x5c/0xa0 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: kn->count#241 --> cpu_hotplug_lock.rw_sem --> mem_hotplug_lock.rw_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(mem_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); lock(mem_hotplug_lock.rw_sem); lock(kn->count#241); *** DEADLOCK *** No fixes tag as this has been a long standing issue that predated the addition of kernfs lockdep annotations. Link: http://lkml.kernel.org/r/157991441887.2763922.4770790047389427325.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams Acked-by: Michal Hocko Reviewed-by: David Hildenbrand Cc: Vishal Verma Cc: Pavel Tatashin Cc: Dave Hansen Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit cb33e477a50b38ce2c2a66763e98b1a27e4a98a1 Author: Amir Goldstein Date: Sun Nov 24 21:31:45 2019 +0200 utimes: Clamp the timestamps in notify_change() commit eb31e2f63d85d1bec4f7b136f317e03c03db5503 upstream. Push clamping timestamps into notify_change(), so in-kernel callers like nfsd and overlayfs will get similar timestamp set behavior as utimes. AV: get rid of clamping in ->setattr() instances; we don't need to bother with that there, with notify_change() doing normalization in all cases now (it already did for implicit case, since current_time() clamps). Suggested-by: Miklos Szeredi Fixes: 42e729b9ddbb ("utimes: Clamp the timestamps before update") Cc: stable@vger.kernel.org # v5.4 Cc: Deepa Dinamani Cc: Jeff Layton Signed-off-by: Amir Goldstein Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 73031a617ac3e6157e117aeb33225debddc3690b Author: zhengbin Date: Fri Oct 4 17:44:20 2019 +0800 mmc: sdhci-pci: Make function amd_sdhci_reset static commit 38413ce39a4bd908c02257cd2f9e0c92b27886f4 upstream. Fix sparse warnings: drivers/mmc/host/sdhci-pci-core.c:1599:6: warning: symbol 'amd_sdhci_reset' was not declared. Should it be static? Reported-by: Hulk Robot Signed-off-by: zhengbin Acked-by: Adrian Hunter Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit af823232b0187b88d39575a76c5a59dbc746fc89 Author: Pingfan Liu Date: Thu Jan 30 22:11:10 2020 -0800 mm/sparse.c: reset section's mem_map when fully deactivated commit 1f503443e7df8dc8366608b4d810ce2d6669827c upstream. After commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug"), when a mem section is fully deactivated, section_mem_map still records the section's start pfn, which is not used any more and will be reassigned during re-addition. In analogy with alloc/free pattern, it is better to clear all fields of section_mem_map. Beside this, it breaks the user space tool "makedumpfile" [1], which makes assumption that a hot-removed section has mem_map as NULL, instead of checking directly against SECTION_MARKED_PRESENT bit. (makedumpfile will be better to change the assumption, and need a patch) The bug can be reproduced on IBM POWERVM by "drmgr -c mem -r -q 5" , trigger a crash, and save vmcore by makedumpfile [1]: makedumpfile, commit e73016540293 ("[v1.6.7] Update version") Link: http://lkml.kernel.org/r/1579487594-28889-1-git-send-email-kernelfans@gmail.com Signed-off-by: Pingfan Liu Acked-by: Michal Hocko Acked-by: David Hildenbrand Cc: Dan Williams Cc: Oscar Salvador Cc: Baoquan He Cc: Qian Cai Cc: Kazuhito Hagio Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c2c814fc9aee7daf696c328045b2ed29f44a391d Author: Theodore Ts'o Date: Thu Jan 30 22:11:04 2020 -0800 memcg: fix a crash in wb_workfn when a device disappears commit 68f23b89067fdf187763e75a56087550624fdbee upstream. Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shutdown the bdi_writeback structure (or wb), and part of that writeback ensures that no other work queued against the wb, and that the wb is fully drained. With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister). Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about the Brave New memcg World, and calls bdi_unregister directly. It does this without informing the file system, or the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shutdown. So when one of these wb's are woken up to do delayed work, they try to dereference their wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*. Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled. The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment. Google Bug Id: 145475544 Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu Signed-off-by: Theodore Ts'o Cc: Chris Mason Cc: Tejun Heo Cc: Jens Axboe Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 39fac95159b4fe22af18b82c80c216f07ba730bf Author: Takashi Iwai Date: Sat Feb 1 09:05:30 2020 +0100 ALSA: dummy: Fix PCM format loop in proc output commit 2acf25f13ebe8beb40e97a1bbe76f36277c64f1e upstream. The loop termination for iterating over all formats should contain SNDRV_PCM_FORMAT_LAST, not less than it. Fixes: 9b151fec139d ("ALSA: dummy - Add debug proc file") Cc: Link: https://lore.kernel.org/r/20200201080530.22390-3-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 6edf790e9e51a1614ab19af12cea933060201ca3 Author: Takashi Iwai Date: Sat Feb 1 09:05:29 2020 +0100 ALSA: usb-audio: Annotate endianess in Scarlett gen2 quirk commit d8f489355cff55b30731354317739a00cf1238bd upstream. The Scarlett gen2 mixer quirk code defines a few record types to communicate via USB hub, and those must be all little-endian. This patch changes the field types to LE to annotate endianess properly. It also fixes the incorrect usage of leXX_to_cpu() in a couple of places, which was caught by sparse after this change. Fixes: 9e4d5c1be21f ("ALSA: usb-audio: Scarlett Gen 2 mixer interface") Cc: Link: https://lore.kernel.org/r/20200201080530.22390-2-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 85dbab63b46117329e4918fbd09c6035f65e72e3 Author: Takashi Iwai Date: Sat Feb 1 09:05:28 2020 +0100 ALSA: usb-audio: Fix endianess in descriptor validation commit f8e5f90b3a53bb75f05124ed19156388379a337d upstream. I overlooked that some fields are words and need the converts from LE in the recently added USB descriptor validation code. This patch fixes those with the proper macro usages. Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units") Cc: Link: https://lore.kernel.org/r/20200201080530.22390-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 2068fbb20b9f9e53c9b98bfc8410fd9878780799 Author: Bryan O'Donoghue Date: Thu Jan 9 13:17:22 2020 +0000 usb: gadget: f_ecm: Use atomic_t to track in-flight request commit d710562e01c48d59be3f60d58b7a85958b39aeda upstream. Currently ecm->notify_req is used to flag when a request is in-flight. ecm->notify_req is set to NULL and when a request completes it is subsequently reset. This is fundamentally buggy in that the unbind logic of the ECM driver will unconditionally free ecm->notify_req leading to a NULL pointer dereference. Fixes: da741b8c56d6 ("usb ethernet gadget: split CDC Ethernet function") Cc: stable Signed-off-by: Bryan O'Donoghue Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit a7d00597e0b7a140a9c7c3d9fdd289d64a5f8562 Author: Bryan O'Donoghue Date: Thu Jan 9 13:17:21 2020 +0000 usb: gadget: f_ncm: Use atomic_t to track in-flight request commit 5b24c28cfe136597dc3913e1c00b119307a20c7e upstream. Currently ncm->notify_req is used to flag when a request is in-flight. ncm->notify_req is set to NULL and when a request completes it is subsequently reset. This is fundamentally buggy in that the unbind logic of the NCM driver will unconditionally free ncm->notify_req leading to a NULL pointer dereference. Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility") Cc: stable Signed-off-by: Bryan O'Donoghue Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 683b53b5aa12693976b39089c6828a735f1d0fd2 Author: Roger Quadros Date: Mon Dec 23 08:47:35 2019 +0200 usb: gadget: legacy: set max_speed to super-speed commit 463f67aec2837f981b0a0ce8617721ff59685c00 upstream. These interfaces do support super-speed so let's not limit maximum speed to high-speed. Cc: Signed-off-by: Roger Quadros Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 063daad1412edb31476a6305a9f0112cd53ed0f2 Author: Peter Chen Date: Thu Dec 12 16:35:03 2019 +0800 usb: gadget: f_fs: set req->num_sgs as 0 for non-sg transfer commit d2450c6937018d40d4111fe830fa48d4ddceb8d0 upstream. The UDC core uses req->num_sgs to judge if scatter buffer list is used. Eg: usb_gadget_map_request_by_dev. For f_fs sync io mode, the request is re-used for each request, so if the 1st request->length > PAGE_SIZE, and the 2nd request->length is <= PAGE_SIZE, the f_fs uses the 1st req->num_sgs for the 2nd request, it causes the UDC core get the wrong req->num_sgs value (The 2nd request doesn't use sg). For f_fs async io mode, it is not harm to initialize req->num_sgs as 0 either, in case, the UDC driver doesn't zeroed request structure. Cc: Jun Li Cc: stable Fixes: 772a7a724f69 ("usb: gadget: f_fs: Allow scatter-gather buffers") Signed-off-by: Peter Chen Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 47dbff7950499f85a635b202b9935833f8df87eb Author: Olof Johansson Date: Mon Jan 20 12:14:07 2020 -0600 objtool: Silence build output commit 6ec14aa7a58a1c2fb303692f8cb1ff82d9abd10a upstream. The sync-check.sh script prints out the path due to a "cd -" at the end of the script, even on silent builds. This isn't even needed, since the script is executed in our build instead of sourced (so it won't change the working directory of the surrounding build anyway). Just remove the cd to make the build silent. Fixes: 2ffd84ae973b ("objtool: Update sync-check.sh from perf's check-headers.sh") Signed-off-by: Olof Johansson Signed-off-by: Josh Poimboeuf Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/cb002857fafa8186cfb9c3e43fb62e4108a1bab9.1579543924.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman commit 72a533fc29e1a3fa17a7d20f888cdd6eb48cf3b7 Author: Jun Li Date: Mon Jan 20 06:43:19 2020 +0000 usb: typec: tcpci: mask event interrupts when remove driver commit 3ba76256fc4e2a0d7fb26cc95459041ea0e88972 upstream. This is to prevent any possible events generated while unregister tpcm port. Fixes: 74e656d6b055 ("staging: typec: Type-C Port Controller Interface driver (tcpci)") Signed-off-by: Li Jun Reviewed-by: Heikki Krogerus Reviewed-by: Guenter Roeck Link: https://lore.kernel.org/r/1579502333-4145-1-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman commit 91cfedb762bd967811d7453fb822907db4b58bb8 Author: Thinh Nguyen Date: Wed Dec 18 18:14:50 2019 -0800 usb: dwc3: gadget: Delay starting transfer commit da10bcdd6f70dc9977f2cf18f4783cf78520623a upstream. If the END_TRANSFER command hasn't completed yet, then don't send the START_TRANSFER command. The controller may not be able to start if that's the case. Some controller revisions depend on this. See commit 76a638f8ac0d ("usb: dwc3: gadget: wait for End Transfer to complete"). Let's only send START_TRANSFER command after the END_TRANSFER command had completed. Fixes: 3aec99154db3 ("usb: dwc3: gadget: remove DWC3_EP_END_TRANSFER_PENDING") Signed-off-by: Thinh Nguyen Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 1dc0d21fc1e64b79a44b81ea7fcc87b173841cbe Author: Thinh Nguyen Date: Wed Dec 18 18:14:44 2019 -0800 usb: dwc3: gadget: Check END_TRANSFER completion commit c58d8bfc77a2c7f6ff6339b58c9fca7ae6f57e70 upstream. While the END_TRANSFER command is sent but not completed, any request dequeue during this time will cause the driver to issue the END_TRANSFER command. The driver needs to submit the command only once to stop the controller from processing further. The controller may take more time to process the same command multiple times unnecessarily. Let's add a flag DWC3_EP_END_TRANSFER_PENDING to check for this condition. Fixes: 3aec99154db3 ("usb: dwc3: gadget: remove DWC3_EP_END_TRANSFER_PENDING") Signed-off-by: Thinh Nguyen Signed-off-by: Felipe Balbi Signed-off-by: Greg Kroah-Hartman commit 4e5d1bf6e68f376019161afe62cd809215570dc4 Author: Navid Emamdoost Date: Sat Dec 14 19:51:14 2019 -0600 brcmfmac: Fix memory leak in brcmf_usbdev_qinit commit 4282dc057d750c6a7dd92953564b15c26b54c22c upstream. In the implementation of brcmf_usbdev_qinit() the allocated memory for reqs is leaking if usb_alloc_urb() fails. Release reqs in the error handling path. Fixes: 71bb244ba2fd ("brcm80211: fmac: add USB support for bcm43235/6/8 chipsets") Signed-off-by: Navid Emamdoost Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit 1c8c75275f9777a162fbbc4da4e02edb0ce60834 Author: Kai-Heng Feng Date: Thu Dec 5 17:07:01 2019 +0800 Bluetooth: btusb: Disable runtime suspend on Realtek devices commit 7ecacafc240638148567742cca41aa7144b4fe1e upstream. After commit 9e45524a0111 ("Bluetooth: btusb: Fix suspend issue for Realtek devices") both WiFi and Bluetooth stop working after reboot: [ 34.322617] usb 1-8: reset full-speed USB device number 3 using xhci_hcd [ 34.450401] usb 1-8: device descriptor read/64, error -71 [ 34.694375] usb 1-8: device descriptor read/64, error -71 ... [ 44.599111] rtw_pci 0000:02:00.0: failed to poll offset=0x5 mask=0x3 value=0x0 [ 44.599113] rtw_pci 0000:02:00.0: mac power on failed [ 44.599114] rtw_pci 0000:02:00.0: failed to power on mac [ 44.599114] rtw_pci 0000:02:00.0: leave idle state failed [ 44.599492] rtw_pci 0000:02:00.0: failed to leave ips state [ 44.599493] rtw_pci 0000:02:00.0: failed to leave idle state That commit removed USB_QUIRK_RESET_RESUME, which not only resets the USB device after resume, it also prevents the device from being runtime suspended by USB core. My experiment shows if the Realtek btusb device ever runtime suspends once, the entire wireless module becomes useless after reboot. So let's explicitly disable runtime suspend on Realtek btusb device for now. Fixes: 9e45524a0111 ("Bluetooth: btusb: Fix suspend issue for Realtek devices") Signed-off-by: Kai-Heng Feng Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman commit 1bfecb50771d649d27836f269c3a34cc25121a6b Author: Colin Ian King Date: Fri Nov 29 17:36:35 2019 +0000 Bluetooth: btusb: fix memory leak on fw commit 3168c19d7eb17a0108a3b60ad8e8c1b18ea05c63 upstream. Currently the error return path when the call to btusb_mtk_hci_wmt_sync fails does not free fw. Fix this by returning via the error_release_fw label that performs the free'ing. Addresses-Coverity: ("Resource leak") Fixes: a1c49c434e15 ("Bluetooth: btusb: Add protocol support for MediaTek MT7668U USB devices") Signed-off-by: Colin Ian King Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman commit 21780d1fd65b8dd4dbc1c2e4d721413430b1618a Author: Israel Rukshin Date: Tue Feb 4 14:38:10 2020 +0200 nvmet: Fix controller use after free commit 1a3f540d63152b8db0a12de508bfa03776217d83 upstream. After nvmet_install_queue() sets sq->ctrl calling to nvmet_sq_destroy() reduces the controller refcount. In case nvmet_install_queue() fails, calling to nvmet_ctrl_put() is done twice (at nvmet_sq_destroy and nvmet_execute_io_connect/nvmet_execute_admin_connect) instead of once for the queue which leads to use after free of the controller. Fix this by set NULL at sq->ctrl in case of a failure at nvmet_install_queue(). The bug leads to the following Call Trace: [65857.994862] refcount_t: underflow; use-after-free. [65858.108304] Workqueue: events nvmet_rdma_release_queue_work [nvmet_rdma] [65858.115557] RIP: 0010:refcount_warn_saturate+0xe5/0xf0 [65858.208141] Call Trace: [65858.211203] nvmet_sq_destroy+0xe1/0xf0 [nvmet] [65858.216383] nvmet_rdma_release_queue_work+0x37/0xf0 [nvmet_rdma] [65858.223117] process_one_work+0x167/0x370 [65858.227776] worker_thread+0x49/0x3e0 [65858.232089] kthread+0xf5/0x130 [65858.235895] ? max_active_store+0x80/0x80 [65858.240504] ? kthread_bind+0x10/0x10 [65858.244832] ret_from_fork+0x1f/0x30 [65858.249074] ---[ end trace f82d59250b54beb7 ]--- Fixes: bb1cc74790eb ("nvmet: implement valid sqhd values in completions") Fixes: 1672ddb8d691 ("nvmet: Add install_queue callout") Signed-off-by: Israel Rukshin Reviewed-by: Max Gurtovoy Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit 6243cb9e32d21e842329dba13f3b4e7bd3cc3148 Author: Israel Rukshin Date: Tue Feb 4 14:38:09 2020 +0200 nvmet: Fix error print message at nvmet_install_queue function commit 0b87a2b795d66be7b54779848ef0f3901c5e46fc upstream. Place the arguments in the correct order. Fixes: 1672ddb8d691 ("nvmet: Add install_queue callout") Signed-off-by: Israel Rukshin Reviewed-by: Max Gurtovoy Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch Signed-off-by: Greg Kroah-Hartman commit 6a4fea54ab46c2623d74449eaaff8a7e08b19c5e Author: Paul E. McKenney Date: Mon Nov 4 08:22:45 2019 -0800 rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special() commit c51f83c315c392d9776c33eb16a2fe1349d65c7f upstream. The rcu_node structure's ->expmask field is updated only when holding the ->lock, but is also accessed locklessly. This means that all ->expmask updates must use WRITE_ONCE() and all reads carried out without holding ->lock must use READ_ONCE(). This commit therefore changes the lockless ->expmask read in rcu_read_unlock_special() to use READ_ONCE(). Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com Signed-off-by: Paul E. McKenney Acked-by: Marco Elver Signed-off-by: Greg Kroah-Hartman commit c71706a5ffff80c580f91ee683ce3ebfc79c2998 Author: Paul E. McKenney Date: Mon Nov 4 08:08:30 2019 -0800 srcu: Apply *_ONCE() to ->srcu_last_gp_end commit 844a378de3372c923909681706d62336d702531e upstream. The ->srcu_last_gp_end field is accessed from any CPU at any time by synchronize_srcu(), so non-initialization references need to use READ_ONCE() and WRITE_ONCE(). This commit therefore makes that change. Reported-by: syzbot+08f3e9d26e5541e1ecf2@syzkaller.appspotmail.com Acked-by: Marco Elver Signed-off-by: Paul E. McKenney Signed-off-by: Greg Kroah-Hartman commit dcad7270b2c72c23dda243d832aff33cb189fc76 Author: Eric Dumazet Date: Wed Oct 9 14:21:54 2019 -0700 rcu: Avoid data-race in rcu_gp_fqs_check_wake() commit 6935c3983b246d5fbfebd3b891c825e65c118f2d upstream. The rcu_gp_fqs_check_wake() function uses rcu_preempt_blocked_readers_cgp() to read ->gp_tasks while other cpus might overwrite this field. We need READ_ONCE()/WRITE_ONCE() pairs to avoid compiler tricks and KCSAN splats like the following : BUG: KCSAN: data-race in rcu_gp_fqs_check_wake / rcu_preempt_deferred_qs_irqrestore write to 0xffffffff85a7f190 of 8 bytes by task 7317 on cpu 0: rcu_preempt_deferred_qs_irqrestore+0x43d/0x580 kernel/rcu/tree_plugin.h:507 rcu_read_unlock_special+0xec/0x370 kernel/rcu/tree_plugin.h:659 __rcu_read_unlock+0xcf/0xe0 kernel/rcu/tree_plugin.h:394 rcu_read_unlock include/linux/rcupdate.h:645 [inline] __ip_queue_xmit+0x3b0/0xa40 net/ipv4/ip_output.c:533 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575 tcp_recvmsg+0x633/0x1a30 net/ipv4/tcp.c:2179 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 read to 0xffffffff85a7f190 of 8 bytes by task 10 on cpu 1: rcu_gp_fqs_check_wake kernel/rcu/tree.c:1556 [inline] rcu_gp_fqs_check_wake+0x93/0xd0 kernel/rcu/tree.c:1546 rcu_gp_fqs_loop+0x36c/0x580 kernel/rcu/tree.c:1611 rcu_gp_kthread+0x143/0x220 kernel/rcu/tree.c:1768 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 10 Comm: rcu_preempt Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet Reported-by: syzbot [ paulmck: Added another READ_ONCE() for RCU CPU stall warnings. ] Signed-off-by: Paul E. McKenney Signed-off-by: Greg Kroah-Hartman commit a523031513b755ab7b3b556b78ff0a385cc3498c Author: Paul E. McKenney Date: Mon Oct 7 18:53:18 2019 -0700 rcu: Use *_ONCE() to protect lockless ->expmask accesses commit 15c7c972cd26d89a26788e609c53b5a465324a6c upstream. The rcu_node structure's ->expmask field is accessed locklessly when starting a new expedited grace period and when reporting an expedited RCU CPU stall warning. This commit therefore handles the former by taking a snapshot of ->expmask while the lock is held and the latter by applying READ_ONCE() to lockless reads and WRITE_ONCE() to the corresponding updates. Link: https://lore.kernel.org/lkml/CANpmjNNmSOagbTpffHr4=Yedckx9Rm2NuGqC9UqE+AOz5f1-ZQ@mail.gmail.com Reported-by: syzbot+134336b86f728d6e55a0@syzkaller.appspotmail.com Signed-off-by: Paul E. McKenney Acked-by: Marco Elver Signed-off-by: Greg Kroah-Hartman commit d42b2370f28abda0c4149be4ff0361e97ddd16d8 Author: Mathieu Desnoyers Date: Sat Aug 17 10:12:08 2019 -0400 tracing: Fix sched switch start/stop refcount racy updates commit 64ae572bc7d0060429e40e1c8d803ce5eb31a0d6 upstream. Reading the sched_cmdline_ref and sched_tgid_ref initial state within tracing_start_sched_switch without holding the sched_register_mutex is racy against concurrent updates, which can lead to tracepoint probes being registered more than once (and thus trigger warnings within tracepoint.c). [ May be the fix for this bug ] Link: https://lore.kernel.org/r/000000000000ab6f84056c786b93@google.com Link: http://lkml.kernel.org/r/20190817141208.15226-1-mathieu.desnoyers@efficios.com Cc: stable@vger.kernel.org CC: Steven Rostedt (VMware) CC: Joel Fernandes (Google) CC: Peter Zijlstra CC: Thomas Gleixner CC: Paul E. McKenney Reported-by: syzbot+774fddf07b7ab29a1e55@syzkaller.appspotmail.com Fixes: d914ba37d7145 ("tracing: Add support for recording tgid of tasks") Signed-off-by: Mathieu Desnoyers Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit e39351c52efbee8623f6f845c7cb6828028aab9b Author: Steven Rostedt (VMware) Date: Fri Jan 24 10:07:42 2020 -0500 tracing/kprobes: Have uname use __get_str() in print_fmt commit 20279420ae3a8ef4c5d9fedc360a2c37a1dbdf1b upstream. Thomas Richter reported: > Test case 66 'Use vfs_getname probe to get syscall args filenames' > is broken on s390, but works on x86. The test case fails with: > > [root@m35lp76 perf]# perf test -F 66 > 66: Use vfs_getname probe to get syscall args filenames > :Recording open file: > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.004 MB /tmp/__perf_test.perf.data.TCdYj\ > (20 samples) ] > Looking at perf.data file for vfs_getname records for the file we touched: > FAILED! > [root@m35lp76 perf]# The root cause was the print_fmt of the kprobe event that referenced the "ustring" > Setting up the kprobe event using perf command: > > # ./perf probe "vfs_getname=getname_flags:72 pathname=filename:ustring" > > generates this format file: > [root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/probe/\ > vfs_getname/format > name: vfs_getname > ID: 1172 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:unsigned long __probe_ip; offset:8; size:8; signed:0; > field:__data_loc char[] pathname; offset:16; size:4; signed:1; > > print fmt: "(%lx) pathname=\"%s\"", REC->__probe_ip, REC->pathname Instead of using "__get_str(pathname)" it referenced it directly. Link: http://lkml.kernel.org/r/20200124100742.4050c15e@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 88903c464321 ("tracing/probe: Add ustring type for user-space string") Acked-by: Masami Hiramatsu Reported-by: Thomas Richter Tested-by: Thomas Richter Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 59b2e64b16bb46dd1bfed0058a9744f90e226d9f Author: Lu Shuaibing Date: Mon Feb 3 17:34:46 2020 -0800 ipc/msg.c: consolidate all xxxctl_down() functions commit 889b331724c82c11e15ba0a60979cf7bded0a26c upstream. A use of uninitialized memory in msgctl_down() because msqid64 in ksys_msgctl hasn't been initialized. The local | msqid64 | is created in ksys_msgctl() and then passed into msgctl_down(). Along the way msqid64 is never initialized before msgctl_down() checks msqid64->msg_qbytes. KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool) reports: ================================================================== BUG: KUMSAN: use of uninitialized memory in msgctl_down+0x94/0x300 Read of size 8 at addr ffff88806bb97eb8 by task syz-executor707/2022 CPU: 0 PID: 2022 Comm: syz-executor707 Not tainted 5.2.0-rc4+ #63 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x75/0xae __kumsan_report+0x17c/0x3e6 kumsan_report+0xe/0x20 msgctl_down+0x94/0x300 ksys_msgctl.constprop.14+0xef/0x260 do_syscall_64+0x7e/0x1f0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x4400e9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd869e0598 EFLAGS: 00000246 ORIG_RAX: 0000000000000047 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004400e9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000401970 R13: 0000000000401a00 R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:ffffea0001aee5c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x100000000000000() raw: 0100000000000000 0000000000000000 ffffffff01ae0101 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kumsan: bad access detected ================================================================== Syzkaller reproducer: msgctl$IPC_RMID(0x0, 0x0) C reproducer: // autogenerated by syzkaller (https://github.com/google/syzkaller) int main(void) { syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0); syscall(__NR_msgctl, 0, 0, 0); return 0; } [natechancellor@gmail.com: adjust indentation in ksys_msgctl] Link: https://github.com/ClangBuiltLinux/linux/issues/829 Link: http://lkml.kernel.org/r/20191218032932.37479-1-natechancellor@gmail.com Link: http://lkml.kernel.org/r/20190613014044.24234-1-shuaibinglu@126.com Signed-off-by: Lu Shuaibing Signed-off-by: Nathan Chancellor Suggested-by: Arnd Bergmann Cc: Davidlohr Bueso Cc: Manfred Spraul Cc: NeilBrown From: Andrew Morton Subject: ipc/msg.c: consolidate all xxxctl_down() functions Each line here overflows 80 cols by exactly one character. Delete one tab per line to fix. Cc: Shaohua Li Cc: Jens Axboe Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8ce07d95d67a9f4236bb44b6c7046a9b0a8d715b Author: Kadlecsik József Date: Sat Jan 25 20:39:25 2020 +0100 netfilter: ipset: fix suspicious RCU usage in find_set_and_id commit 5038517119d50ed0240059b1d7fc2faa92371c08 upstream. find_set_and_id() is called when the NFNL_SUBSYS_IPSET mutex is held. However, in the error path there can be a follow-up recvmsg() without the mutex held. Use the start() function of struct netlink_dump_control instead of dump() to verify and report if the specified set does not exist. Thanks to Pablo Neira Ayuso for helping me to understand the subleties of the netlink protocol. Reported-by: syzbot+fc69d7cb21258ab4ae4d@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 7bad0dda8164c4cd6f1c43444089fb9c4188cdf4 Author: Oliver Neukum Date: Thu Nov 21 11:28:10 2019 +0100 mfd: dln2: More sanity checking for endpoints commit 2b8bd606b1e60ca28c765f69c1eedd7d2a2e9dca upstream. It is not enough to check for the number of endpoints. The types must also be correct. Reported-and-tested-by: syzbot+48a2851be24583b864dc@syzkaller.appspotmail.com Signed-off-by: Oliver Neukum Reviewed-by: Greg Kroah-Hartman Signed-off-by: Lee Jones Signed-off-by: Greg Kroah-Hartman commit 6fcbff54ded118b29ca05f56aea85825d24a5645 Author: Will Deacon Date: Fri Nov 8 16:48:38 2019 +0100 media: uvcvideo: Avoid cyclic entity chains due to malformed USB descriptors commit 68035c80e129c4cfec659aac4180354530b26527 upstream. Way back in 2017, fuzzing the 4.14-rc2 USB stack with syzkaller kicked up the following WARNING from the UVC chain scanning code: | list_add double add: new=ffff880069084010, prev=ffff880069084010, | next=ffff880067d22298. | ------------[ cut here ]------------ | WARNING: CPU: 1 PID: 1846 at lib/list_debug.c:31 __list_add_valid+0xbd/0xf0 | Modules linked in: | CPU: 1 PID: 1846 Comm: kworker/1:2 Not tainted | 4.14.0-rc2-42613-g1488251d1a98 #238 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 | Workqueue: usb_hub_wq hub_event | task: ffff88006b01ca40 task.stack: ffff880064358000 | RIP: 0010:__list_add_valid+0xbd/0xf0 lib/list_debug.c:29 | RSP: 0018:ffff88006435ddd0 EFLAGS: 00010286 | RAX: 0000000000000058 RBX: ffff880067d22298 RCX: 0000000000000000 | RDX: 0000000000000058 RSI: ffffffff85a58800 RDI: ffffed000c86bbac | RBP: ffff88006435dde8 R08: 1ffff1000c86ba52 R09: 0000000000000000 | R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069084010 | R13: ffff880067d22298 R14: ffff880069084010 R15: ffff880067d222a0 | FS: 0000000000000000(0000) GS:ffff88006c900000(0000) knlGS:0000000000000000 | CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 | CR2: 0000000020004ff2 CR3: 000000006b447000 CR4: 00000000000006e0 | Call Trace: | __list_add ./include/linux/list.h:59 | list_add_tail+0x8c/0x1b0 ./include/linux/list.h:92 | uvc_scan_chain_forward.isra.8+0x373/0x416 | drivers/media/usb/uvc/uvc_driver.c:1471 | uvc_scan_chain drivers/media/usb/uvc/uvc_driver.c:1585 | uvc_scan_device drivers/media/usb/uvc/uvc_driver.c:1769 | uvc_probe+0x77f2/0x8f00 drivers/media/usb/uvc/uvc_driver.c:2104 Looking into the output from usbmon, the interesting part is the following data packet: ffff880069c63e00 30710169 C Ci:1:002:0 0 143 = 09028f00 01030080 00090403 00000e01 00000924 03000103 7c003328 010204db If we drop the lead configuration and interface descriptors, we're left with an output terminal descriptor describing a generic display: /* Output terminal descriptor */ buf[0] 09 buf[1] 24 buf[2] 03 /* UVC_VC_OUTPUT_TERMINAL */ buf[3] 00 /* ID */ buf[4] 01 /* type == 0x0301 (UVC_OTT_DISPLAY) */ buf[5] 03 buf[6] 7c buf[7] 00 /* source ID refers to self! */ buf[8] 33 The problem with this descriptor is that it is self-referential: the source ID of 0 matches itself! This causes the 'struct uvc_entity' representing the display to be added to its chain list twice during 'uvc_scan_chain()': once via 'uvc_scan_chain_entity()' when it is processed directly from the 'dev->entities' list and then again immediately afterwards when trying to follow the source ID in 'uvc_scan_chain_forward()' Add a check before adding an entity to a chain list to ensure that the entity is not already part of a chain. Link: https://lore.kernel.org/linux-media/CAAeHK+z+Si69jUR+N-SjN9q4O+o5KFiNManqEa-PjUta7EOb7A@mail.gmail.com/ Cc: Fixes: c0efd232929c ("V4L/DVB (8145a): USB Video Class driver") Reported-by: Andrey Konovalov Signed-off-by: Will Deacon Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 9f5c4fac341ceefdd38d9bd477a2bd34d65610d4 Author: Vasundhara Volam Date: Sun Feb 2 02:41:37 2020 -0500 bnxt_en: Fix logic that disables Bus Master during firmware reset. [ Upstream commit d407302895d3f3ca3a333c711744a95e0b1b0150 ] The current logic that calls pci_disable_device() in __bnxt_close_nic() during firmware reset is flawed. If firmware is still alive, we're disabling the device too early, causing some firmware commands to not reach the firmware. Fix it by moving the logic to bnxt_reset_close(). If firmware is in fatal condition, we call pci_disable_device() before we free any of the rings to prevent DMA corruption of the freed rings. If firmware is still alive, we call pci_disable_device() after the last firmware message has been sent. Fixes: 3bc7d4a352ef ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.") Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit fddd3f73ad0891dcafd78a6b8595ea7dda30bcab Author: Taehee Yoo Date: Sat Feb 1 16:43:22 2020 +0000 netdevsim: fix stack-out-of-bounds in nsim_dev_debugfs_init() [ Upstream commit 6fb8852b1298200da39bd85788bc5755d1d56f32 ] When netdevsim dev is being created, a debugfs directory is created. The variable "dev_ddir_name" is 16bytes device name pointer and device name is "netdevsim". The maximum dev id length is 10. So, 16bytes for device name isn't enough. Test commands: modprobe netdevsim echo "1000000000 0" > /sys/bus/netdevsim/new_device Splat looks like: [ 249.622710][ T900] BUG: KASAN: stack-out-of-bounds in number+0x824/0x880 [ 249.623658][ T900] Write of size 1 at addr ffff88804c527988 by task bash/900 [ 249.624521][ T900] [ 249.624830][ T900] CPU: 1 PID: 900 Comm: bash Not tainted 5.5.0+ #322 [ 249.625691][ T900] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 249.626712][ T900] Call Trace: [ 249.627103][ T900] dump_stack+0x96/0xdb [ 249.627639][ T900] ? number+0x824/0x880 [ 249.628173][ T900] print_address_description.constprop.5+0x1be/0x360 [ 249.629022][ T900] ? number+0x824/0x880 [ 249.629569][ T900] ? number+0x824/0x880 [ 249.630105][ T900] __kasan_report+0x12a/0x170 [ 249.630717][ T900] ? number+0x824/0x880 [ 249.631201][ T900] kasan_report+0xe/0x20 [ 249.631723][ T900] number+0x824/0x880 [ 249.632235][ T900] ? put_dec+0xa0/0xa0 [ 249.632716][ T900] ? rcu_read_lock_sched_held+0x90/0xc0 [ 249.633392][ T900] vsnprintf+0x63c/0x10b0 [ 249.633983][ T900] ? pointer+0x5b0/0x5b0 [ 249.634543][ T900] ? mark_lock+0x11d/0xc40 [ 249.635200][ T900] sprintf+0x9b/0xd0 [ 249.635750][ T900] ? scnprintf+0xe0/0xe0 [ 249.636370][ T900] nsim_dev_probe+0x63c/0xbf0 [netdevsim] [ ... ] Reviewed-by: Jakub Kicinski Fixes: ab1d0cc004d7 ("netdevsim: change debugfs tree topology") Signed-off-by: Taehee Yoo Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f146529c00499ea251d0e9679ba702af1f8a19f9 Author: Lukas Bulwahn Date: Sat Feb 1 13:43:01 2020 +0100 MAINTAINERS: correct entries for ISDN/mISDN section [ Upstream commit dff6bc1bfd462b76dc13ec19dedc2c134a62ac59 ] Commit 6d97985072dc ("isdn: move capi drivers to staging") cleaned up the isdn drivers and split the MAINTAINERS section for ISDN, but missed to add the terminal slash for the two directories mISDN and hardware. Hence, all files in those directories were not part of the new ISDN/mISDN SUBSYSTEM, but were considered to be part of "THE REST". Rectify the situation, and while at it, also complete the section with two further build files that belong to that subsystem. This was identified with a small script that finds all files belonging to "THE REST" according to the current MAINTAINERS file, and I investigated upon its output. Fixes: 6d97985072dc ("isdn: move capi drivers to staging") Signed-off-by: Lukas Bulwahn Acked-by: Arnd Bergmann Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 096df4720ab6e7d8d79035d1466fc8cac320fc2b Author: Shannon Nelson Date: Thu Jan 30 10:07:06 2020 -0800 ionic: fix rxq comp packet type mask [ Upstream commit b5ce31b5e11b768b7d685b2bab7db09ad5549493 ] Be sure to include all the packet type bits in the mask. Fixes: fbfb8031533c ("ionic: Add hardware init and device commands") Signed-off-by: Shannon Nelson Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b6a7ba0e8622caae175be80931325233cf970369 Author: Eric Dumazet Date: Fri Jan 31 10:44:50 2020 -0800 tcp: clear tp->segs_{in|out} in tcp_disconnect() [ Upstream commit 784f8344de750a41344f4bbbebb8507a730fc99c ] tp->segs_in and tp->segs_out need to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: 2efd055c53c0 ("tcp: add tcpi_segs_in and tcpi_segs_out to tcp_info") Signed-off-by: Eric Dumazet Cc: Marcelo Ricardo Leitner Cc: Yuchung Cheng Cc: Neal Cardwell Acked-by: Neal Cardwell Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 31fceaf085978d7c2373ea25f4be641de42b69b8 Author: Eric Dumazet Date: Fri Jan 31 10:32:41 2020 -0800 tcp: clear tp->data_segs{in|out} in tcp_disconnect() [ Upstream commit db7ffee6f3eb3683cdcaeddecc0a630a14546fe3 ] tp->data_segs_in and tp->data_segs_out need to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: a44d6eacdaf5 ("tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In") Signed-off-by: Eric Dumazet Cc: Martin KaFai Lau Cc: Yuchung Cheng Cc: Neal Cardwell Acked-by: Neal Cardwell Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 2fc4773b6af753f98c5fdecd9f520c02e85267b6 Author: Eric Dumazet Date: Fri Jan 31 10:22:47 2020 -0800 tcp: clear tp->delivered in tcp_disconnect() [ Upstream commit 2fbdd56251b5c62f96589f39eded277260de7267 ] tp->delivered needs to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: ddf1af6fa00e ("tcp: new delivery accounting") Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Cc: Neal Cardwell Acked-by: Yuchung Cheng Acked-by: Neal Cardwell Acked-by: Soheil Hassas Yeganeh Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c9b6c6d07e5cb65d1ef2fee6f2c13aeb88ba0d53 Author: Eric Dumazet Date: Fri Jan 31 09:14:47 2020 -0800 tcp: clear tp->total_retrans in tcp_disconnect() [ Upstream commit c13c48c00a6bc1febc73902505bdec0967bd7095 ] total_retrans needs to be cleared in tcp_disconnect(). tcp_disconnect() is rarely used, but it is worth fixing it. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Cc: SeongJae Park Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 269a3c472a6fa53837aee596d909cef73c2560cf Author: David Howells Date: Thu Jan 30 21:50:36 2020 +0000 rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect [ Upstream commit 5273a191dca65a675dc0bcf3909e59c6933e2831 ] When a call is disconnected, the connection pointer from the call is cleared to make sure it isn't used again and to prevent further attempted transmission for the call. Unfortunately, there might be a daemon trying to use it at the same time to transmit a packet. Fix this by keeping call->conn set, but setting a flag on the call to indicate disconnection instead. Remove also the bits in the transmission functions where the conn pointer is checked and a ref taken under spinlock as this is now redundant. Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs") Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 843e115de475bd25303eed8173997a43ed09dcc8 Author: David Howells Date: Thu Jan 30 21:50:36 2020 +0000 rxrpc: Fix missing active use pinning of rxrpc_local object [ Upstream commit 04d36d748fac349b068ef621611f454010054c58 ] The introduction of a split between the reference count on rxrpc_local objects and the usage count didn't quite go far enough. A number of kernel work items need to make use of the socket to perform transmission. These also need to get an active count on the local object to prevent the socket from being closed. Fix this by getting the active count in those places. Also split out the raw active count get/put functions as these places tend to hold refs on the rxrpc_local object already, so getting and putting an extra object ref is just a waste of time. The problem can lead to symptoms like: BUG: kernel NULL pointer dereference, address: 0000000000000018 .. CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51 ... RIP: 0010:selinux_socket_sendmsg+0x5/0x13 ... Call Trace: security_socket_sendmsg+0x2c/0x3e sock_sendmsg+0x1a/0x46 rxrpc_send_keepalive+0x131/0x1ae rxrpc_peer_keepalive_worker+0x219/0x34b process_one_work+0x18e/0x271 worker_thread+0x1a3/0x247 kthread+0xe6/0xeb ret_from_fork+0x1f/0x30 Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting") Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 524cba2f8de57d612eee67a3fe5f672cc39f549c Author: David Howells Date: Thu Jan 30 21:50:36 2020 +0000 rxrpc: Fix insufficient receive notification generation [ Upstream commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b ] In rxrpc_input_data(), rxrpc_notify_socket() is called if the base sequence number of the packet is immediately following the hard-ack point at the end of the function. However, this isn't sufficient, since the recvmsg side may have been advancing the window and then overrun the position in which we're adding - at which point rx_hard_ack >= seq0 and no notification is generated. Fix this by always generating a notification at the end of the input function. Without this, a long call may stall, possibly indefinitely. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 62221a9b1c9aef316ffa493340583827c045c2c0 Author: David Howells Date: Thu Jan 30 21:50:35 2020 +0000 rxrpc: Fix use-after-free in rxrpc_put_local() [ Upstream commit fac20b9e738523fc884ee3ea5be360a321cd8bad ] Fix rxrpc_put_local() to not access local->debug_id after calling atomic_dec_return() as, unless that returned n==0, we no longer have the right to access the object. Fixes: 06d9532fa6b3 ("rxrpc: Fix read-after-free in rxrpc_queue_local()") Signed-off-by: David Howells Signed-off-by: Greg Kroah-Hartman commit 5fa06c9568214537c42b32c9ec23ae0c70a4f8b7 Author: Michael Chan Date: Sun Feb 2 02:41:38 2020 -0500 bnxt_en: Fix TC queue mapping. [ Upstream commit 18e4960c18f484ac288f41b43d0e6c4c88e6ea78 ] The driver currently only calls netdev_set_tc_queue when the number of TCs is greater than 1. Instead, the comparison should be greater than or equal to 1. Even with 1 TC, we need to set the queue mapping. This bug can cause warnings when the number of TCs is changed back to 1. Fixes: 7809592d3e2e ("bnxt_en: Enable MSIX early in bnxt_init_one().") Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8566221e46eb78358b9b1620c798ab9c9f5af355 Author: Nicolin Chen Date: Fri Jan 31 18:01:24 2020 -0800 net: stmmac: Delete txtimer in suspend() [ Upstream commit 14b41a2959fbaa50932699d32ceefd6643abacc6 ] When running v5.5 with a rootfs on NFS, memory abort may happen in the system resume stage: Unable to handle kernel paging request at virtual address dead00000000012a [dead00000000012a] address between user and kernel address ranges pc : run_timer_softirq+0x334/0x3d8 lr : run_timer_softirq+0x244/0x3d8 x1 : ffff800011cafe80 x0 : dead000000000122 Call trace: run_timer_softirq+0x334/0x3d8 efi_header_end+0x114/0x234 irq_exit+0xd0/0xd8 __handle_domain_irq+0x60/0xb0 gic_handle_irq+0x58/0xa8 el1_irq+0xb8/0x180 arch_cpu_idle+0x10/0x18 do_idle+0x1d8/0x2b0 cpu_startup_entry+0x24/0x40 secondary_start_kernel+0x1b4/0x208 Code: f9000693 a9400660 f9000020 b4000040 (f9000401) ---[ end trace bb83ceeb4c482071 ]--- Kernel panic - not syncing: Fatal exception in interrupt SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 2-3 Kernel Offset: disabled CPU features: 0x00002,2300aa30 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- It's found that stmmac_xmit() and stmmac_resume() sometimes might run concurrently, possibly resulting in a race condition between mod_timer() and setup_timer(), being called by stmmac_xmit() and stmmac_resume() respectively. Since the resume() runs setup_timer() every time, it'd be safer to have del_timer_sync() in the suspend() as the counterpart. Signed-off-by: Nicolin Chen Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit dd8142a6fa5270783d415292ec8169f4ea2a5468 Author: Cong Wang Date: Sun Feb 2 21:14:35 2020 -0800 net_sched: fix an OOB access in cls_tcindex [ Upstream commit 599be01ee567b61f4471ee8078870847d0a11e8e ] As Eric noticed, tcindex_alloc_perfect_hash() uses cp->hash to compute the size of memory allocation, but cp->hash is set again after the allocation, this caused an out-of-bound access. So we have to move all cp->hash initialization and computation before the memory allocation. Move cp->mask and cp->shift together as cp->hash may need them for computation too. Reported-and-tested-by: syzbot+35d4dea36c387813ed31@syzkaller.appspotmail.com Fixes: 331b72922c5f ("net: sched: RCU cls_tcindex") Cc: Eric Dumazet Cc: John Fastabend Cc: Jamal Hadi Salim Cc: Jiri Pirko Cc: Jakub Kicinski Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 919f13c56485f078026b8cabdcc04363be370f37 Author: Eric Dumazet Date: Mon Feb 3 10:15:07 2020 -0800 net: hsr: fix possible NULL deref in hsr_handle_frame() [ Upstream commit 2b5b8251bc9fe2f9118411f037862ee17cf81e97 ] hsr_port_get_rcu() can return NULL, so we need to be careful. general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] CPU: 1 PID: 10249 Comm: syz-executor.5 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline] RIP: 0010:hsr_addr_is_self+0x86/0x330 net/hsr/hsr_framereg.c:44 Code: 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 6b ff 94 f9 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 75 02 00 00 48 8b 43 30 49 39 c6 49 89 47 c0 0f RSP: 0018:ffffc90000da8a90 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff87e0cc33 RDX: 0000000000000006 RSI: ffffffff87e035d5 RDI: 0000000000000000 RBP: ffffc90000da8b20 R08: ffff88808e7de040 R09: ffffed1015d2707c R10: ffffed1015d2707b R11: ffff8880ae9383db R12: ffff8880a689bc5e R13: 1ffff920001b5153 R14: 0000000000000030 R15: ffffc90000da8af8 FS: 00007fd7a42be700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32338000 CR3: 00000000a928c000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: hsr_handle_frame+0x1c5/0x630 net/hsr/hsr_slave.c:31 __netif_receive_skb_core+0xfbc/0x30b0 net/core/dev.c:5099 __netif_receive_skb_one_core+0xa8/0x1a0 net/core/dev.c:5196 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5312 process_backlog+0x206/0x750 net/core/dev.c:6144 napi_poll net/core/dev.c:6582 [inline] net_rx_action+0x508/0x1120 net/core/dev.c:6650 __do_softirq+0x262/0x98c kernel/softirq.c:292 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082 Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f3dea4cea67ab67585e48e87136c3f66c3bae1e5 Author: Ridge Kennedy Date: Tue Feb 4 12:24:00 2020 +1300 l2tp: Allow duplicate session creation with UDP [ Upstream commit 0d0d9a388a858e271bb70e71e99e7fe2a6fd6f64 ] In the past it was possible to create multiple L2TPv3 sessions with the same session id as long as the sessions belonged to different tunnels. The resulting sessions had issues when used with IP encapsulated tunnels, but worked fine with UDP encapsulated ones. Some applications began to rely on this behaviour to avoid having to negotiate unique session ids. Some time ago a change was made to require session ids to be unique across all tunnels, breaking the applications making use of this "feature". This change relaxes the duplicate session id check to allow duplicates if both of the colliding sessions belong to UDP encapsulated tunnels. Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation") Signed-off-by: Ridge Kennedy Acked-by: James Chapman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b080bc8481d952a11f76bf87fbe68aeee019b093 Author: Taehee Yoo Date: Tue Feb 4 03:24:59 2020 +0000 gtp: use __GFP_NOWARN to avoid memalloc warning [ Upstream commit bd5cd35b782abf5437fbd01dfaee12437d20e832 ] gtp hashtable size is received by user-space. So, this hashtable size could be too large. If so, kmalloc will internally print a warning message. This warning message is actually not necessary for the gtp module. So, this patch adds __GFP_NOWARN to avoid this message. Splat looks like: [ 2171.200049][ T1860] WARNING: CPU: 1 PID: 1860 at mm/page_alloc.c:4713 __alloc_pages_nodemask+0x2f3/0x740 [ 2171.238885][ T1860] Modules linked in: gtp veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv] [ 2171.262680][ T1860] CPU: 1 PID: 1860 Comm: gtp-link Not tainted 5.5.0+ #321 [ 2171.263567][ T1860] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 2171.264681][ T1860] RIP: 0010:__alloc_pages_nodemask+0x2f3/0x740 [ 2171.265332][ T1860] Code: 64 fe ff ff 65 48 8b 04 25 c0 0f 02 00 48 05 f0 12 00 00 41 be 01 00 00 00 49 89 47 0 [ 2171.267301][ T1860] RSP: 0018:ffff8880b51af1f0 EFLAGS: 00010246 [ 2171.268320][ T1860] RAX: ffffed1016a35e43 RBX: 0000000000000000 RCX: 0000000000000000 [ 2171.269517][ T1860] RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000000000 [ 2171.270305][ T1860] RBP: 0000000000040cc0 R08: ffffed1018893109 R09: dffffc0000000000 [ 2171.275973][ T1860] R10: 0000000000000001 R11: ffffed1018893108 R12: 1ffff11016a35e43 [ 2171.291039][ T1860] R13: 000000000000000b R14: 000000000000000b R15: 00000000000f4240 [ 2171.292328][ T1860] FS: 00007f53cbc83740(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 2171.293409][ T1860] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2171.294586][ T1860] CR2: 000055f540014508 CR3: 00000000b49f2004 CR4: 00000000000606e0 [ 2171.295424][ T1860] Call Trace: [ 2171.295756][ T1860] ? mark_held_locks+0xa5/0xe0 [ 2171.296659][ T1860] ? __alloc_pages_slowpath+0x21b0/0x21b0 [ 2171.298283][ T1860] ? gtp_encap_enable_socket+0x13e/0x400 [gtp] [ 2171.298962][ T1860] ? alloc_pages_current+0xc1/0x1a0 [ 2171.299475][ T1860] kmalloc_order+0x22/0x80 [ 2171.299936][ T1860] kmalloc_order_trace+0x1d/0x140 [ 2171.300437][ T1860] __kmalloc+0x302/0x3a0 [ 2171.300896][ T1860] gtp_newlink+0x293/0xba0 [gtp] [ ... ] Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0f8f0429a299e0e3d8542420c153959dfc844b94 Author: Eric Dumazet Date: Fri Jan 31 15:27:04 2020 -0800 cls_rsvp: fix rsvp_policy [ Upstream commit cb3c0e6bdf64d0d124e94ce43cbe4ccbb9b37f51 ] NLA_BINARY can be confusing, since .len value represents the max size of the blob. cls_rsvp really wants user space to provide long enough data for TCA_RSVP_DST and TCA_RSVP_SRC attributes. BUG: KMSAN: uninit-value in rsvp_get net/sched/cls_rsvp.h:258 [inline] BUG: KMSAN: uninit-value in gen_handle net/sched/cls_rsvp.h:402 [inline] BUG: KMSAN: uninit-value in rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572 CPU: 1 PID: 13228 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 rsvp_get net/sched/cls_rsvp.h:258 [inline] gen_handle net/sched/cls_rsvp.h:402 [inline] rsvp_change+0x1ae9/0x4220 net/sched/cls_rsvp.h:572 tc_new_tfilter+0x31fe/0x5010 net/sched/cls_api.c:2104 rtnetlink_rcv_msg+0xcb7/0x1570 net/core/rtnetlink.c:5415 netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x451/0x5f0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45b349 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f269d43dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f269d43e6d4 RCX: 000000000045b349 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000009c2 R14: 00000000004cb338 R15: 000000000075bfd4 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2774 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline] netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x451/0x5f0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 6fa8c0144b77 ("[NET_SCHED]: Use nla_policy for attribute validation in classifiers") Signed-off-by: Eric Dumazet Reported-by: syzbot Acked-by: Cong Wang Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 097ef8be69a3597b8c8423c0abed0ee74175da4a Author: Vasundhara Volam Date: Mon Jan 27 04:56:22 2020 -0500 bnxt_en: Move devlink_register before registering netdev [ Upstream commit cda2cab0771183932d6ba73c5ac63bb63decdadf ] Latest kernels get the phys_port_name via devlink, if ndo_get_phys_port_name is not defined. To provide the phys_port_name correctly, register devlink before registering netdev. Also call devlink_port_type_eth_set() after registering netdev as devlink port updates the netdev structure and notifies user. Cc: Jiri Pirko Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit aaf5369c72e98215c00363ac2bc4ae573e8dff5d Author: Arnd Bergmann Date: Tue Jan 14 14:26:14 2020 +0100 sparc32: fix struct ipc64_perm type definition [ Upstream commit 34ca70ef7d3a9fa7e89151597db5e37ae1d429b4 ] As discussed in the strace issue tracker, it appears that the sparc32 sysvipc support has been broken for the past 11 years. It was however working in compat mode, which is how it must have escaped most of the regular testing. The problem is that a cleanup patch inadvertently changed the uid/gid fields in struct ipc64_perm from 32-bit types to 16-bit types in uapi headers. Both glibc and uclibc-ng still use the original types, so they should work fine with compat mode, but not natively. Change the definitions to use __kernel_uid32_t and __kernel_gid32_t again. Fixes: 83c86984bff2 ("sparc: unify ipcbuf.h") Link: https://github.com/strace/strace/issues/116 Cc: # v2.6.29 Cc: Sam Ravnborg Cc: "Dmitry V . Levin" Cc: Rich Felker Cc: libc-alpha@sourceware.org Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin