commit 67f57203a5ad04c53eb9f2bd02902434aec32408 Author: Alexandre Frade Date: Sun Oct 16 01:10:35 2022 +0000 Linux 6.0.2-rt11-xanmod1 Signed-off-by: Alexandre Frade commit b6e8b10527264b4ed9093ea3a74eec6dc0b96c8e Author: Alexandre Frade Date: Fri Oct 14 00:29:43 2022 +0000 XANMOD: rcu: Change sched_setscheduler_nocheck() calls to SCHED_RR policy Signed-off-by: Alexandre Frade commit 8d6c1a69b07a3fec56f8326fe9dbab93cae0bef8 Merge: bdfeaefcf21f 16c33ae7f3d5 Author: Alexandre Frade Date: Sun Oct 16 01:04:49 2022 +0000 Merge tag 'v6.0-rt11' into 6.0-rt v6.0-rt11 commit bdfeaefcf21f30ab319dc5f8b703086209d0d9ea Author: Alexandre Frade Date: Sun Oct 16 01:03:58 2022 +0000 Revert "XANMOD: rcu: Change sched_setscheduler_nocheck() calls to SCHED_RR policy" This reverts commit ee134f7541bb038bc3e6c61cc195aa39205e79dd. commit 72d19be7f818d147743b57d76e3f4c6ae7f67cbe Author: Alexandre Frade Date: Sat Oct 15 15:08:54 2022 +0000 Linux 6.0.2-xanmod1 Signed-off-by: Alexandre Frade commit 94bc5d51e7d8a4a4e1a4393dbbada2efc1294a17 Merge: f9885bf68d2a dab08f7eecdf Author: Alexandre Frade Date: Sat Oct 15 15:07:11 2022 +0000 Merge tag 'v6.0.2' into 6.0 This is the 6.0.2 stable release commit dab08f7eecdfa2ea024dc563dc09afae137e3b38 Author: Greg Kroah-Hartman Date: Sat Oct 15 08:02:59 2022 +0200 Linux 6.0.2 Link: https://lore.kernel.org/r/20221013175146.507746257@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: Justin M. 
Forbes Tested-by: Florian Fainelli Tested-by: Slade Watkins Tested-by: Ron Economos Tested-by: Bagas Sanjaya Tested-by: Linux Kernel Functional Testing Tested-by: Sudip Mukherjee Tested-by: Luna Jernberg Tested-by: Shuah Khan Tested-by: Jon Hunter Tested-by: Fenil Jain Tested-by: Allen Pais Tested-by: Guenter Roeck Tested-by: Rudi Heitbaum Signed-off-by: Greg Kroah-Hartman commit 6c01739c2aba19553beb20491b05515af9246f0f Author: Shunsuke Mie Date: Wed Sep 7 11:01:00 2022 +0900 misc: pci_endpoint_test: Fix pci_endpoint_test_{copy,write,read}() panic commit 8e30538eca016de8e252bef174beadecd64239f0 upstream. dma_map_single() doesn't permit zero-length mappings. It causes the following panic. A panic was reported on arm64: [ 60.137988] ------------[ cut here ]------------ [ 60.142630] kernel BUG at kernel/dma/swiotlb.c:624! [ 60.147508] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 60.152992] Modules linked in: dw_hdmi_cec crct10dif_ce simple_bridge rcar_fdp1 vsp1 rcar_vin videobuf2_vmalloc rcar_csi2 v4l2_mem2mem videobuf2_dma_contig videobuf2_memops pci_endpoint_test videobuf2_v4l2 videobuf2_common rcar_fcp v4l2_fwnode v4l2_async videodev mc gpio_bd9571mwv max9611 pwm_rcar ccree at24 authenc libdes phy_rcar_gen3_usb3 usb_dmac display_connector pwm_bl [ 60.186252] CPU: 0 PID: 508 Comm: pcitest Not tainted 6.0.0-rc1rpci-dev+ #237 [ 60.193387] Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) [ 60.201302] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 60.208263] pc : swiotlb_tbl_map_single+0x2c0/0x590 [ 60.213149] lr : swiotlb_map+0x88/0x1f0 [ 60.216982] sp : ffff80000a883bc0 [ 60.220292] x29: ffff80000a883bc0 x28: 0000000000000000 x27: 0000000000000000 [ 60.227430] x26: 0000000000000000 x25: ffff0004c0da20d0 x24: ffff80000a1f77c0 [ 60.234567] x23: 0000000000000002 x22: 0001000040000010 x21: 000000007a000000 [ 60.241703] x20: 0000000000200000 x19: 0000000000000000 x18: 0000000000000000 [ 60.248840] x17: 0000000000000000 x16: 
0000000000000000 x15: ffff0006ff7b9180 [ 60.255977] x14: ffff0006ff7b9180 x13: 0000000000000000 x12: 0000000000000000 [ 60.263113] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 [ 60.270249] x8 : 0001000000000010 x7 : ffff0004c6754b20 x6 : 0000000000000000 [ 60.277385] x5 : ffff0004c0da2090 x4 : 0000000000000000 x3 : 0000000000000001 [ 60.284521] x2 : 0000000040000000 x1 : 0000000000000000 x0 : 0000000040000010 [ 60.291658] Call trace: [ 60.294100] swiotlb_tbl_map_single+0x2c0/0x590 [ 60.298629] swiotlb_map+0x88/0x1f0 [ 60.302115] dma_map_page_attrs+0x188/0x230 [ 60.306299] pci_endpoint_test_ioctl+0x5e4/0xd90 [pci_endpoint_test] [ 60.312660] __arm64_sys_ioctl+0xa8/0xf0 [ 60.316583] invoke_syscall+0x44/0x108 [ 60.320334] el0_svc_common.constprop.0+0xcc/0xf0 [ 60.325038] do_el0_svc+0x2c/0xb8 [ 60.328351] el0_svc+0x2c/0x88 [ 60.331406] el0t_64_sync_handler+0xb8/0xc0 [ 60.335587] el0t_64_sync+0x18c/0x190 [ 60.339251] Code: 52800013 d2e00414 35fff45c d503201f (d4210000) [ 60.345344] ---[ end trace 0000000000000000 ]--- To fix it, this patch adds a check that the payload length is not zero. Fixes: 343dc693f7b7 ("misc: pci_endpoint_test: Prevent some integer overflows") Cc: stable Signed-off-by: Shunsuke Mie Link: https://lore.kernel.org/r/20220907020100.122588-2-mie@igel.co.jp Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 579592f2674a9bfcc7c73fa8c9d9f27ab550f646 Author: Shunsuke Mie Date: Wed Sep 7 11:00:59 2022 +0900 misc: pci_endpoint_test: Aggregate params checking for xfer commit 3e42deaac06567c7e86d287c305ccda24db4ae3d upstream. Each of the transfer test functions has the same parameter-checking code. This patch consolidates that code into a newly introduced function. 
Signed-off-by: Shunsuke Mie Cc: stable Link: https://lore.kernel.org/r/20220907020100.122588-1-mie@igel.co.jp Signed-off-by: Greg Kroah-Hartman commit 605b64a1bffec19e5480a276a1e5de3a92432b3a Author: Cameron Gutman Date: Thu Aug 18 17:44:09 2022 +0200 Input: xpad - fix wireless 360 controller breaking after suspend commit a17b9841152e7f4621619902b347e2cc39c32996 upstream. Suspending and resuming the system can sometimes cause the out URB to get hung after a reset_resume. This causes LED setting and force feedback to break on resume. To avoid this, just drop the reset_resume callback so the USB core rebinds xpad to the wireless pads on resume if a reset happened. A nice side effect of this change is the LED ring on wireless controllers is now set correctly on system resume. Cc: stable@vger.kernel.org Fixes: 4220f7db1e42 ("Input: xpad - workaround dead irq_out after suspend/ resume") Signed-off-by: Cameron Gutman Signed-off-by: Pavel Rojtberg Link: https://lore.kernel.org/r/20220818154411.510308-3-rojtberg@gmail.com Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit dc3c3feff776fb4cee96614b05990c683d87650a Author: Pavel Rojtberg Date: Thu Aug 18 17:44:08 2022 +0200 Input: xpad - add supported devices as contributed on github commit b382c5e37344883dc97525d05f1f6b788f549985 upstream. This is based on multiple commits at https://github.com/paroj/xpad Cc: stable@vger.kernel.org Signed-off-by: Jasper Poppe Signed-off-by: Jeremy Palmer Signed-off-by: Ruineka Signed-off-by: Cleber de Mattos Casali Signed-off-by: Kyle Gospodnetich Signed-off-by: Pavel Rojtberg Link: https://lore.kernel.org/r/20220818154411.510308-2-rojtberg@gmail.com Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit 3c7c84319833259b0bb8c879928700c9e42d6562 Author: Jeremy Kerr Date: Wed Oct 12 10:08:51 2022 +0800 mctp: prevent double key removal and unref commit 3a732b46736cd8a29092e4b0b1a9ba83e672bf89 upstream. 
Currently, we have a bug where a simultaneous DROPTAG ioctl and socket close may race, as we attempt to remove a key from lists twice, and perform an unref for each removal operation. This may result in a uaf when we attempt the second unref. This change fixes the race by making __mctp_key_remove tolerant to being called on a key that has already been removed from the socket/net lists, and only performs the unref when we do the actual remove. We also need to hold the list lock on the ioctl cleanup path. This fix is based on a bug report and comprehensive analysis from butt3rflyh4ck , found via syzkaller. Cc: stable@vger.kernel.org Fixes: 63ed1aab3d40 ("mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control") Reported-by: butt3rflyh4ck Signed-off-by: Jeremy Kerr Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dbd8cc654b5bb03ee6c06e2a3cb1bac981a675ad Author: Johannes Berg Date: Wed Oct 5 23:11:43 2022 +0200 wifi: cfg80211: update hidden BSSes to avoid WARN_ON commit c90b93b5b782891ebfda49d4e5da36632fefd5d1 upstream. When updating beacon elements in a non-transmitted BSS, also update the hidden sub-entries to the same beacon elements, so that a future update through other paths won't trigger a WARN_ON(). The warning is triggered because the beacon elements in the hidden BSSes that are children of the BSS should always be the same as in the parent. Reported-by: Sönke Huster Tested-by: Sönke Huster Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 8ed62f2df8ebcf79c185f1bc3e4f346ea0905da6 Author: Johannes Berg Date: Wed Oct 5 21:24:10 2022 +0200 wifi: mac80211: fix crash in beacon protection for P2P-device commit b2d03cabe2b2e150ff5a381731ea0355459be09f upstream. 
If beacon protection is active but the beacon cannot be decrypted or is otherwise malformed, we call the cfg80211 API to report this to userspace, but that uses a netdev pointer, which isn't present for P2P-Device. Fix this to call it only conditionally to ensure cfg80211 won't crash in the case of P2P-Device. This fixes CVE-2022-42722. Reported-by: Sönke Huster Fixes: 9eaf183af741 ("mac80211: Report beacon protection failures to user space") Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit d484f564f49dc7e302f85c9cbc90e72e585e926d Author: Johannes Berg Date: Wed Oct 5 15:10:09 2022 +0200 wifi: mac80211_hwsim: avoid mac80211 warning on bad rate commit 1833b6f46d7e2830251a063935ab464256defe22 upstream. If the tool on the other side (e.g. wmediumd) gets confused about the rate, we hit a warning in mac80211. Silence that by effectively duplicating the check here and dropping the frame silently (in mac80211 it's dropped with the warning). Reported-by: Sönke Huster Tested-by: Sönke Huster Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 377cb1ce85878c197904ca8383e6b41886e3994d Author: Johannes Berg Date: Sat Oct 1 00:01:44 2022 +0200 wifi: cfg80211: avoid nontransmitted BSS list corruption commit bcca852027e5878aec911a347407ecc88d6fff7f upstream. If a non-transmitted BSS shares enough information (both SSID and BSSID!) with another non-transmitted BSS of a different AP, then we can find and update it, and then try to add it to the non-transmitted BSS list. We do a search for it on the transmitted BSS, but if it's not there (but belongs to another transmitted BSS), the list gets corrupted. Since this is an erroneous situation, simply fail the list insertion in this case and free the non-transmitted BSS. This fixes CVE-2022-42721. 
Reported-by: Sönke Huster Tested-by: Sönke Huster Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit e97a5d7091e6d2df05f8378a518a9bbf81688b77 Author: Johannes Berg Date: Fri Sep 30 23:44:23 2022 +0200 wifi: cfg80211: fix BSS refcounting bugs commit 0b7808818cb9df6680f98996b8e9a439fa7bcc2f upstream. There are multiple refcounting bugs related to multi-BSSID: - In bss_ref_get(), if the BSS has a hidden_beacon_bss, then the bss pointer is overwritten before checking for the transmitted BSS, which is clearly wrong. Fix this by using the bss_from_pub() macro. - In cfg80211_bss_update() we copy the transmitted_bss pointer from tmp into new, but then if we release new, we'll unref it erroneously. We already set the pointer and ref it, but need to NULL it since it was copied from the tmp data. - In cfg80211_inform_single_bss_data(), if adding to the non-transmitted list fails, we unlink the BSS and yet still we return it, but this results in returning an entry without a reference. We shouldn't return it anyway if it was broken enough to not get added there. This fixes CVE-2022-42720. Reported-by: Sönke Huster Tested-by: Sönke Huster Fixes: a3584f56de1c ("cfg80211: Properly track transmitting and non-transmitting BSS") Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 8820e70f0ad84f6443ff5ad0f9b463a620116579 Author: Johannes Berg Date: Thu Sep 29 21:50:44 2022 +0200 wifi: cfg80211: ensure length byte is present before access commit 567e14e39e8f8c6997a1378bc3be615afca86063 upstream. When iterating the elements here, ensure the length byte is present before checking it to see if the entire element will fit into the buffer. Longer term, we should rewrite this code using the type-safe element iteration macros that check all of this. 
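The length-byte check described in the message above can be illustrated with a minimal userspace sketch. This is not the cfg80211 code: count_elements() is a hypothetical helper, and it only assumes the usual element layout of a one-byte ID, a one-byte length, and then that many bytes of data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: count the complete
 * ID/length/data elements at the start of buf.
 */
static size_t count_elements(const uint8_t *buf, size_t len)
{
	size_t pos = 0, count = 0;

	assert(buf != NULL || len == 0);

	/* Both the ID and the length byte must exist before buf[pos + 1] is read. */
	while (len - pos >= 2) {
		size_t elem_len = (size_t)buf[pos + 1] + 2;

		/* Stop if the element data would run past the end of the buffer. */
		if (elem_len > len - pos)
			break;

		pos += elem_len;
		count++;
	}

	return count;
}
```

Because pos never exceeds len, the subtraction len - pos cannot wrap, and a trailing lone ID byte ends the walk instead of triggering a read one byte past the buffer.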
Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Reported-by: Soenke Huster Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 4afcb8886800131f8dd58d82754ee0c508303d46 Author: Johannes Berg Date: Wed Sep 28 22:07:15 2022 +0200 wifi: mac80211: fix MBSSID parsing use-after-free commit ff05d4b45dd89b922578dac497dcabf57cf771c6 upstream. When we parse a multi-BSSID element, we might point some element pointers into the allocated nontransmitted_profile. However, we free this before returning, causing UAF when the relevant pointers in the parsed elements are accessed. Fix this by not allocating the scratch buffer separately but as part of the returned structure instead, that way, there are no lifetime issues with it. The scratch buffer introduction as part of the returned data here is taken from MLO feature work done by Ilan. This fixes CVE-2022-42719. Fixes: 5023b14cf4df ("mac80211: support profile split between elements") Co-developed-by: Ilan Peer Signed-off-by: Ilan Peer Reviewed-by: Kees Cook Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 4609a23ce88b5b1bde2a03a96c03df19fac18e7b Author: Johannes Berg Date: Wed Sep 28 22:01:37 2022 +0200 wifi: cfg80211/mac80211: reject bad MBSSID elements commit 8f033d2becc24aa6bfd2a5c104407963560caabc upstream. Per spec, the maximum value for the MaxBSSID ('n') indicator is 8, and the minimum is 1 since a multiple BSSID set with just one BSSID doesn't make sense (the # of BSSIDs is limited by 2^n). Limit this in the parsing in both cfg80211 and mac80211, rejecting any elements with an invalid value. This fixes potentially bad shifts in the processing of these inside the cfg80211_gen_new_bssid() function later. I found this during the investigation of CVE-2022-41674 fixed by the previous patch. 
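The range check on the MaxBSSID indicator described above can be sketched as follows. The names here (MAX_BSSID_INDICATOR_MIN, MAX_BSSID_INDICATOR_MAX, max_bssid_indicator_valid(), max_bssid_count()) are hypothetical and are not taken from cfg80211 or mac80211; the sketch only shows why rejecting values outside 1..8 keeps a later 2^n shift well defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the bounds quoted in the commit message. */
enum {
	MAX_BSSID_INDICATOR_MIN = 1,
	MAX_BSSID_INDICATOR_MAX = 8,
};

/* The largest accepted shift must fit in an unsigned int. */
static_assert(MAX_BSSID_INDICATOR_MAX < sizeof(unsigned int) * CHAR_BIT,
	      "shift count must be smaller than the width of unsigned int");

/* Accept only indicator values in the range 1..8. */
static bool max_bssid_indicator_valid(uint8_t n)
{
	return n >= MAX_BSSID_INDICATOR_MIN && n <= MAX_BSSID_INDICATOR_MAX;
}

/* Upper bound on the number of BSSIDs in the set (2^n), or 0 if n is invalid. */
static unsigned int max_bssid_count(uint8_t n)
{
	if (!max_bssid_indicator_valid(n))
		return 0;

	return 1u << n;
}
```

Returning 0 for a rejected indicator is a choice made for this sketch; a parser would more likely skip or fail the element, as the message describes.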
Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Fixes: 78ac51f81532 ("mac80211: support multi-bssid") Reviewed-by: Kees Cook Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit fc1ed6d0c9898a68da7f1f7843560dfda57683e2 Author: Johannes Berg Date: Wed Sep 28 21:56:15 2022 +0200 wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans() commit aebe9f4639b13a1f4e9a6b42cdd2e38c617b442d upstream. In the copy code of the elements, we do the following calculation to reach the end of the MBSSID element: /* copy the IEs after MBSSID */ cpy_len = mbssid[1] + 2; This looks fine, however, cpy_len is a u8, the same as mbssid[1], so the addition of two can overflow. In this case the subsequent memcpy() will overflow the allocated buffer, since it copies 256 bytes too much due to the way the allocation and memcpy() sizes are calculated. Fix this by using size_t for the cpy_len variable. This fixes CVE-2022-41674. Reported-by: Soenke Huster Tested-by: Soenke Huster Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Reviewed-by: Kees Cook Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit a232bc42e1b746fef4a821459dc44643f16bad51 Author: Jason A. Donenfeld Date: Thu Sep 22 18:46:04 2022 +0200 random: use expired timer rather than wq for mixing fast pool commit 748bc4dd9e663f23448d8ad7e58c011a67ea1eca upstream. Previously, the fast pool was dumped into the main pool periodically in the fast pool's hard IRQ handler. This worked fine and there weren't problems with it, until RT came around. Since RT converts spinlocks into sleeping locks, problems cropped up. Rather than switching to raw spinlocks, the RT developers preferred we make the transformation from originally doing: do_some_stuff() spin_lock() do_some_other_stuff() spin_unlock() to doing: do_some_stuff() queue_work_on(some_other_stuff_worker) This is an ordinary pattern done all over the kernel. 
However, Sherry noticed a 10% performance regression in qperf TCP over a 40gbps InfiniBand card. Quoting her message: > MT27500 Family [ConnectX-3] cards: > Infiniband device 'mlx4_0' port 1 status: > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1 > base lid: 0x6 > sm lid: 0x1 > state: 4: ACTIVE > phys state: 5: LinkUp > rate: 40 Gb/sec (4X QDR) > link_layer: InfiniBand > > Cards are configured with IP addresses on private subnet for IPoIB > performance testing. > Regression identified in this bug is in TCP latency in this stack as reported > by qperf tcp_lat metric: > > We have one system listen as a qperf server: > [root@yourQperfServer ~]# qperf > > Have the other system connect to qperf server as a client (in this > case, it’s X7 server with Mellanox card): > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat Rather than incur the scheduling latency from queue_work_on, we can instead switch to running on the next timer tick, on the same core. This also batches things a bit more -- once per jiffy -- which is okay now that mix_interrupt_randomness() can credit multiple bits at once. Reported-by: Sherry Yang Tested-by: Paul Webb Cc: Sherry Yang Cc: Phillip Goerl Cc: Jack Vogel Cc: Nicky Veitch Cc: Colm Harrington Cc: Ramanan Govindarajan Cc: Sebastian Andrzej Siewior Cc: Dominik Brodowski Cc: Tejun Heo Cc: Sultan Alsawaf Cc: stable@vger.kernel.org Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit f0b13483ee942b76350afeddd8131285cf4d3de2 Author: Jason A. Donenfeld Date: Thu Sep 22 18:46:04 2022 +0200 random: avoid reading two cache lines on irq randomness commit 9ee0507e896b45af6d65408c77815800bce30008 upstream. In order to avoid reading and dirtying two cache lines on every IRQ, move the work_struct to the bottom of the fast_pool struct. 
add_interrupt_randomness() always touches .pool and .count, which are currently split, because .mix pushes everything down. Instead, move .mix to the bottom, so that .pool and .count are always in the first cache line, since .mix is only accessed when the pool is full. Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Reviewed-by: Sebastian Andrzej Siewior Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit 5bb860b8d4fcaebcde22a745e6714adfddf58773 Author: Giovanni Cabiddu Date: Fri Sep 9 11:49:13 2022 +0100 Revert "crypto: qat - reduce size of mapped region" commit 9c5f21b198d259bfe1191b1fedf08e2eab15b33b upstream. This reverts commit e48767c17718067ba21fb2ef461779ec2506f845. In an attempt to resolve a set of warnings reported by the static analyzer Smatch, the reverted commit improperly reduced the sizes of the DMA mappings used for the input and output parameters for both RSA and DH creating a mismatch (map size=8 bytes, unmap size=64 bytes). This issue is reported when CONFIG_DMA_API_DEBUG is selected, when the crypto self test is run. The function dma_unmap_single() reports a warning similar to the one below, saying that the `device driver frees DMA memory with different size`. DMA-API: 4xxx 0000:06:00.0: device driver frees DMA memory with different size [device address=0x0000000123206c80] [map size=8 bytes] [unmap size=64 bytes] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:973 check_unmap+0x3d0/0x8c0\ ... Call Trace: debug_dma_unmap_page+0x5c/0x60 qat_dh_cb+0xd7/0x110 [intel_qat] qat_alg_asym_callback+0x1a/0x30 [intel_qat] adf_response_handler+0xbd/0x1a0 [intel_qat] tasklet_action_common.constprop.0+0xcd/0xe0 __do_softirq+0xf8/0x30c __irq_exit_rcu+0xbf/0x140 common_interrupt+0xb9/0xd0 The original commit was correct. 
Cc: Reported-by: Herbert Xu Signed-off-by: Giovanni Cabiddu Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit e471bc03f89ea81bebc1d8dc90de7cb0a8ec8e19 Author: Nathan Lynch Date: Wed Sep 7 17:01:11 2022 -0500 Revert "powerpc/rtas: Implement reentrant rtas call" commit f88aabad33ea22be2ce1c60d8901942e4e2a9edb upstream. At the time this was submitted by Leonardo, I confirmed -- or thought I had confirmed -- with PowerVM partition firmware development that the following RTAS functions: - ibm,get-xive - ibm,int-off - ibm,int-on - ibm,set-xive were safe to call on multiple CPUs simultaneously, not only with respect to themselves as indicated by PAPR, but with arbitrary other RTAS calls: https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/ Recent discussion with firmware development makes it clear that this is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") is unsafe, likely explaining several strange bugs we've seen in internal testing involving DLPAR and LPM. These scenarios use ibm,configure-connector, whose internal state can be corrupted by the concurrent use of the "reentrant" functions, leading to symptoms like endless busy statuses from RTAS. Fixes: b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Nathan Lynch Reviewed-by: Laurent Dufour Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20220907220111.223267-1-nathanl@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit 53c2e5d5b5ca92b06c02b5461f555181d56484be Author: Andy Shevchenko Date: Tue Sep 27 18:53:32 2022 +0300 Revert "usb: dwc3: Don't switch OTG -> peripheral if extcon is present" commit 7a84e7353e23202d4f82b05093af4db2b26e6768 upstream. This reverts commit 0f01017191384e3962fa31520a9fd9846c3d352f. As pointed out by Ferry this breaks Dual Role support on Intel Merrifield platforms. 
Fixes: 0f0101719138 ("usb: dwc3: Don't switch OTG -> peripheral if extcon is present") Reported-by: Ferry Toth Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko Tested-by: Ferry Toth # for Merrifield Reviewed-by: Sven Peter Link: https://lore.kernel.org/r/20220927155332.10762-3-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 26e0a333a84be981e8ff079740d9e31c8747e9dd Author: Andy Shevchenko Date: Tue Sep 27 18:53:31 2022 +0300 Revert "USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"" commit 2adc960ce79d3231b02f820daeee434542fe2911 upstream. This reverts commit 8bd6b8c4b1009d7d2662138d6bdc6fe58a9274c5. Prerequisite revert for the reverting of the original commit 0f0101719138. Fixes: 8bd6b8c4b100 ("USB: fixup for merge issue with "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"") Fixes: 0f0101719138 ("usb: dwc3: Don't switch OTG -> peripheral if extcon is present") Reported-by: Ferry Toth Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko Tested-by: Ferry Toth # for Merrifield Link: https://lore.kernel.org/r/20220927155332.10762-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 59f29f77c9b6b151abf25adee48da94d4dd9d104 Author: Frank Wunderlich Date: Mon Sep 26 17:07:39 2022 +0200 USB: serial: qcserial: add new usb-id for Dell branded EM7455 commit eee48781ea199e32c1d0c4732641c494833788ca upstream. Add support for Dell 5811e (EM7455) with USB-id 0x413c:0x81c2. Signed-off-by: Frank Wunderlich Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit b9b7369d89924a366b20045dc26dc4dc6b0567a4 Author: Linus Torvalds Date: Fri Sep 9 08:54:47 2022 +0200 scsi: stex: Properly zero out the passthrough command structure commit 6022f210461fef67e6e676fd8544ca02d1bcfa7a upstream. 
The passthrough structure is declared off of the stack, so it needs to be set to zero before copied back to userspace to prevent any unintentional data leakage. Switch things to be statically allocated which will fill the unused fields with 0 automatically. Link: https://lore.kernel.org/r/YxrjN3OOw2HHl9tx@kroah.com Cc: stable@kernel.org Cc: "James E.J. Bottomley" Cc: "Martin K. Petersen" Cc: Dan Carpenter Reported-by: hdthky Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit f22520a2136ad12b228c0b33435732e0899589b1 Author: Arun Easi Date: Fri Aug 26 03:25:54 2022 -0700 scsi: qla2xxx: Fix response queue handler reading stale packets commit e4f8a29deb3ba30e414dfb6b09e3ae3bf6dbe74a upstream. On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer. Link: https://lore.kernel.org/r/20220826102559.17474-3-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani Signed-off-by: Arun Easi Signed-off-by: Nilesh Javali Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit add6d15e3d02b647165fc81e5aced4b3c911133b Author: Arun Easi Date: Fri Aug 26 03:25:53 2022 -0700 scsi: qla2xxx: Revert "scsi: qla2xxx: Fix response queue handler reading stale packets" commit 6dc45a7322cb9db48a5b6696597a00ef7c778ef9 upstream. Reverting this commit so that a fixed up patch, without adding new module parameters, can be submitted. Link: https://lore.kernel.org/stable/166039743723771@kroah.com/ This reverts commit b1f707146923335849fb70237eec27d4d1ae7d62. Link: https://lore.kernel.org/r/20220826102559.17474-2-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani Signed-off-by: Arun Easi Signed-off-by: Nilesh Javali Signed-off-by: Martin K. 
Petersen Signed-off-by: Greg Kroah-Hartman commit 16c0b849ebd4d89b0d1b4802ff74fb533f9cf0e3 Author: Orlando Chamberlain Date: Thu Sep 29 11:49:56 2022 +0000 efi: Correct Macmini DMI match in uefi cert quirk commit bab715bdaa9ebf28d99a6d1efb2704a30125e96d upstream. It turns out Apple doesn't capitalise the "mini" in "Macmini" in DMI, which is inconsistent with other model line names. Correct the capitalisation of Macmini in the quirk for skipping loading platform certs on T2 Macs. Currently users get: ------------[ cut here ]------------ [Firmware Bug]: Page fault caused by firmware at PA: 0xffffa30640054000 WARNING: CPU: 1 PID: 8 at arch/x86/platform/efi/quirks.c:735 efi_crash_gracefully_on_page_fault+0x55/0xe0 Modules linked in: CPU: 1 PID: 8 Comm: kworker/u12:0 Not tainted 5.18.14-arch1-2-t2 #1 4535eb3fc40fd08edab32a509fbf4c9bc52d111e Hardware name: Apple Inc. Macmini8,1/Mac-7BA5B2DFE22DDD8C, BIOS 1731.120.10.0.0 (iBridge: 19.16.15071.0.0,0) 04/24/2022 Workqueue: efi_rts_wq efi_call_rts ... ---[ end trace 0000000000000000 ]--- efi: Froze efi_rts_wq and disabled EFI Runtime Services integrity: Couldn't get size: 0x8000000000000015 integrity: MODSIGN: Couldn't get UEFI db list efi: EFI Runtime Services are disabled! integrity: Couldn't get size: 0x8000000000000015 integrity: Couldn't get UEFI dbx list Fixes: 155ca952c7ca ("efi: Do not import certificates from UEFI Secure Boot for T2 Macs") Cc: stable@vger.kernel.org Cc: Aditya Garg Tested-by: Samuel Jiang Signed-off-by: Orlando Chamberlain Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit c5b4ed9bec58ecb21d88e2ae6f9bb55661e10ec6 Author: Takashi Iwai Date: Thu Sep 29 08:14:55 2022 +0200 ALSA: hda/realtek: Add quirk for HP Zbook Firefly 14 G9 model commit 225f6e1bc151978041595c7d2acaded3aac41f54 upstream. HP Zbook Firefly 14 G9 model (103c:8abb) requires yet another binding with CS35L41 codec, but with a slightly different configuration. It's over spi1 instead of spi0. 
Create a new fixup entry for that. Cc: Link: https://lore.kernel.org/r/20220929061455.13355-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0aba8959aa922c12c6df86785d8c12707face3d6 Author: Takashi Iwai Date: Sat Oct 1 16:21:24 2022 +0200 ALSA: hda: Fix position reporting on Poulsbo commit 56e696c0f0c71b77fff921fc94b58a02f0445b2c upstream. Hans reported that his Sony VAIO VPX11S1E showed the broken sound behavior at the start of the stream for a couple of seconds, and it turned out that the position_fix=1 option fixes the issue. It implies that the position reporting is inaccurate, and very likely hitting on all Poulsbo devices. The patch applies the workaround for Poulsbo generically to switch to LPIB mode instead of the default position buffer. Reported-and-tested-by: Hans de Goede Cc: Link: https://lore.kernel.org/r/3e8697e1-87c6-7a7b-d2e8-b21f1d2f181b@redhat.com Link: https://lore.kernel.org/r/20221001142124.7241-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit f4f5b6cf3e1db05560a1f0d157d695401dd16804 Author: Jason A. Donenfeld Date: Fri Sep 23 02:42:51 2022 +0200 random: clamp credited irq bits to maximum mixed commit e78a802a7b4febf53f2a92842f494b01062d85a8 upstream. Since the most that's mixed into the pool is sizeof(long)*2, don't credit more than that many bytes of entropy. Fixes: e3e33fc2ea7f ("random: do not use input pool from hard IRQs") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit a3dbb621eed976d4427a1ddd88967cc48d930987 Author: Jason A. Donenfeld Date: Thu Sep 8 16:14:00 2022 +0200 random: restore O_NONBLOCK support commit cd4f24ae9404fd31fc461066e57889be3b68641b upstream. Prior to 5.6, when /dev/random was opened with O_NONBLOCK, it would return -EAGAIN if there was no entropy. When the pools were unified in 5.6, this was lost. 
The post 5.6 behavior of blocking until the pool is initialized, and ignoring O_NONBLOCK in the process, went unnoticed, with no reports about the regression received for two and a half years. However, eventually this indeed did break somebody's userspace. So we restore the old behavior, by returning -EAGAIN if the pool is not initialized. Unlike the old /dev/random, this can only occur during early boot, after which it never blocks again. In order to make this O_NONBLOCK behavior consistent with other expectations, also respect users reading with preadv2(RWF_NOWAIT) and similar. Fixes: 30c08efec888 ("random: make /dev/random be almost like /dev/urandom") Reported-by: Guozihua Reported-by: Zhongguohua Cc: Al Viro Cc: Theodore Ts'o Cc: Andrew Lutomirski Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit 7d70af8676a7c6cd8c3a2022e33530e14bd0cd7b Author: Rishabh Bhatnagar Date: Tue Sep 20 19:19:32 2022 +0000 nvme-pci: set min_align_mask before calculating max_hw_sectors commit 61ce339f19fabbc3e51237148a7ef6f2270e44fa upstream. If swiotlb is force enabled dma_max_mapping_size ends up calling swiotlb_max_mapping_size which takes into account the min align mask for the device. Set the min align mask for nvme driver before calling dma_max_mapping_size while calculating max hw sectors. Signed-off-by: Rishabh Bhatnagar Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman commit 5c0776b5bc31de7cd28afb558fae37a20f33602e Author: Ryusuke Konishi Date: Thu Sep 29 21:33:30 2022 +0900 nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure commit 723ac751208f6d6540191689cfbf6c77135a7a1b upstream. If creation or finalization of a checkpoint fails due to anomalies in the checkpoint metadata on disk, a kernel warning is generated. This patch replaces the WARN_ONs by nilfs_error, so that a kernel, booted with panic_on_warn, does not panic. 
A nilfs_error is appropriate here to handle the abnormal filesystem condition. This also replaces the detected error codes with an I/O error so that neither of the internal error codes is returned to callers. Link: https://lkml.kernel.org/r/20220929123330.19658-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi Reported-by: syzbot+fbb3e0b24e8dae5a16ee@syzkaller.appspotmail.com Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 9dc48a360e7b6bb16c48625f8f80ab7665bc9648 Author: Ryusuke Konishi Date: Fri Oct 7 17:52:26 2022 +0900 nilfs2: fix leak of nilfs_root in case of writer thread creation failure commit d0d51a97063db4704a5ef6bc978dddab1636a306 upstream. If nilfs_attach_log_writer() failed to create a log writer thread, it frees a data structure of the log writer without any cleanup. After commit e912a5b66837 ("nilfs2: use root object to get ifile"), this causes a leak of struct nilfs_root, which started to leak an ifile metadata inode and a kobject on that struct. In addition, if the kernel is booted with panic_on_warn, the above ifile metadata inode leak will cause the following panic when the nilfs2 kernel module is removed: kmem_cache_destroy nilfs2_inode_cache: Slab cache still has objects when called from nilfs_destroy_cachep+0x16/0x3a [nilfs2] WARNING: CPU: 8 PID: 1464 at mm/slab_common.c:494 kmem_cache_destroy+0x138/0x140 ... RIP: 0010:kmem_cache_destroy+0x138/0x140 Code: 00 20 00 00 e8 a9 55 d8 ff e9 76 ff ff ff 48 8b 53 60 48 c7 c6 20 70 65 86 48 c7 c7 d8 69 9c 86 48 8b 4c 24 28 e8 ef 71 c7 00 <0f> 0b e9 53 ff ff ff c3 48 81 ff ff 0f 00 00 77 03 31 c0 c3 53 48 ... Call Trace: ? nilfs_palloc_freev.cold.24+0x58/0x58 [nilfs2] nilfs_destroy_cachep+0x16/0x3a [nilfs2] exit_nilfs_fs+0xa/0x1b [nilfs2] __x64_sys_delete_module+0x1d9/0x3a0 ? __sanitizer_cov_trace_pc+0x1a/0x50 ? syscall_trace_enter.isra.19+0x119/0x190 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd ... Kernel panic - not syncing: panic_on_warn set ... 
This patch fixes these issues by calling nilfs_detach_log_writer() cleanup function if spawning the log writer thread fails. Link: https://lkml.kernel.org/r/20221007085226.57667-1-konishi.ryusuke@gmail.com Fixes: e912a5b66837 ("nilfs2: use root object to get ifile") Signed-off-by: Ryusuke Konishi Reported-by: syzbot+7381dc4ad60658ca4c05@syzkaller.appspotmail.com Tested-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 6251c9c0430d70cc221d0bb907b278bd99d7b066 Author: Ryusuke Konishi Date: Tue Oct 4 00:05:19 2022 +0900 nilfs2: fix use-after-free bug of struct nilfs_root commit d325dc6eb763c10f591c239550b8c7e5466a5d09 upstream. If the beginning of the inode bitmap area is corrupted on disk, an inode with the same inode number as the root inode can be allocated and fail soon after. In this case, the subsequent call to nilfs_clear_inode() on that bogus root inode will wrongly decrement the reference counter of struct nilfs_root, and this will erroneously free struct nilfs_root, causing kernel oopses. This fixes the problem by changing nilfs_new_inode() to skip reserved inode numbers while repairing the inode bitmap. Link: https://lkml.kernel.org/r/20221003150519.39789-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi Reported-by: syzbot+b8c672b0e22615c80fe0@syzkaller.appspotmail.com Reported-by: Khalid Masum Tested-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 037e760a4a009e9545a51e87c98c22d9aaf32df7 Author: Ryusuke Konishi Date: Sun Oct 2 12:08:04 2022 +0900 nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level() commit 21a87d88c2253350e115029f14fe2a10a7e6c856 upstream. If the i_mode field in inode of metadata files is corrupted on disk, it can cause the initialization of bmap structure, which should have been called from nilfs_read_inode_common(), not to be called. 
This causes a lockdep warning followed by a NULL pointer dereference at nilfs_bmap_lookup_at_level(). This patch fixes these issues by adding a missing sanity check for the i_mode field of metadata file's inode. Link: https://lkml.kernel.org/r/20221002030804.29978-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi Reported-by: syzbot+2b32eb36c1a825b7a74c@syzkaller.appspotmail.com Reported-by: Tetsuo Handa Tested-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 16c33ae7f3d52020152dd01890a6e7c924a7485e Author: Sebastian Andrzej Siewior Date: Tue Oct 4 16:58:44 2022 +0200 v6.0-rt11 Signed-off-by: Sebastian Andrzej Siewior commit 7fada86ad7e23ae36c7b77387921735289080c23 Merge: 9cd1036de86a 4fe89d07dcc2 Author: Sebastian Andrzej Siewior Date: Tue Oct 4 16:58:31 2022 +0200 Merge tag 'v6.0' into linux-6.0.y-rt Linux 6.0 Signed-off-by: Sebastian Andrzej Siewior commit 9cd1036de86a91cf6c29ac56a9c5bbad9719f8c9 Author: Sebastian Andrzej Siewior Date: Mon Sep 26 18:10:56 2022 +0200 v6.0-rc7-rt10 Signed-off-by: Sebastian Andrzej Siewior commit 22b80994e85578c02612c990fa39a230275f64f8 Merge: f85c74e09dce f76349cf4145 Author: Sebastian Andrzej Siewior Date: Mon Sep 26 18:10:04 2022 +0200 Merge tag 'v6.0-rc7' into linux-6.0.y-rt Linux 6.0-rc7 Signed-off-by: Sebastian Andrzej Siewior commit f85c74e09dcec83cc5271cd9742d098c82030369 Author: Sebastian Andrzej Siewior Date: Mon Sep 19 13:02:44 2022 +0200 v6.0-rc6-rt9 Signed-off-by: Sebastian Andrzej Siewior commit 10b19b000766620dbc80b72603bcd9fb69e6e0a2 Author: Sebastian Andrzej Siewior Date: Mon Sep 19 13:01:00 2022 +0200 Revert "checkpatch: Print an error if rwlock.h is included directly." This reverts commit b9f266031d5d9344f82b1cc4daebc2fcff3dd7fc.
It has been superseded by commit de296aa684396 ("locking: Detect includes rwlock.h outside of spinlock.h") Signed-off-by: Sebastian Andrzej Siewior commit 3e79fca071bcf4b9783e9b67de79344dcb32ffb0 Author: Sebastian Andrzej Siewior Date: Mon Sep 19 12:49:14 2022 +0200 v6.0-rc6-rt8 Signed-off-by: Sebastian Andrzej Siewior commit d380eee881ef03c897104f9113ca444fd6761d18 Merge: d53cc07b3640 521a547ced64 Author: Sebastian Andrzej Siewior Date: Mon Sep 19 12:48:54 2022 +0200 Merge tag 'v6.0-rc6' into linux-6.0.y-rt Linux 6.0-rc6 Signed-off-by: Sebastian Andrzej Siewior commit d53cc07b3640b5325b5e128f9a3102051579dd73 Author: Sebastian Andrzej Siewior Date: Tue Sep 13 13:43:49 2022 +0200 v6.0-rc5-rt7 Signed-off-by: Sebastian Andrzej Siewior commit 2f4f29a5be35bb0dcd4dead3ab402fb29495feaf Merge: 193531650b56 80e78fcce86d Author: Sebastian Andrzej Siewior Date: Tue Sep 13 13:43:09 2022 +0200 Merge tag 'v6.0-rc5' into linux-6.0.y-rt Linux 6.0-rc5 Signed-off-by: Sebastian Andrzej Siewior commit 193531650b5693d74c26abe552834a2f45311ad3 Author: Sebastian Andrzej Siewior Date: Mon Sep 5 08:56:49 2022 +0200 v6.0-rc4-rt6 Signed-off-by: Sebastian Andrzej Siewior commit 5835bb5903ccc65604b57b0072f50cd9cee69ef1 Author: Sebastian Andrzej Siewior Date: Mon Sep 5 08:55:23 2022 +0200 net: Remove the obsolte u64_stats_fetch_*_irq() users Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. [bigeasy: needed since the introduction of them just got merged.] 
Signed-off-by: Sebastian Andrzej Siewior commit b3954eb50a451385cde177619c2e0094ed1e760f Merge: b20cff55cf5d 7e18e42e4b28 Author: Sebastian Andrzej Siewior Date: Mon Sep 5 08:53:32 2022 +0200 Merge tag 'v6.0-rc4' into linux-6.0.y-rt Linux 6.0-rc4 Signed-off-by: Sebastian Andrzej Siewior commit b20cff55cf5dedc58a3aa5af49e7a7ce26ee9b97 Author: Sebastian Andrzej Siewior Date: Mon Aug 29 08:31:54 2022 +0200 v6.0-rc3-rt5 Signed-off-by: Sebastian Andrzej Siewior commit 93669589ef517eac8f348f948e9dfdbd5e689e6d Merge: 32766b841905 b90cb1053190 Author: Sebastian Andrzej Siewior Date: Mon Aug 29 08:31:34 2022 +0200 Merge tag 'v6.0-rc3' into linux-6.0.y-rt Linux 6.0-rc3 Signed-off-by: Sebastian Andrzej Siewior commit 32766b841905a0afd97a555b825d2d768184e391 Author: Sebastian Andrzej Siewior Date: Fri Aug 26 16:50:43 2022 +0200 v6.0-rc2-rt4 Signed-off-by: Sebastian Andrzej Siewior commit b6dad732c3de5371f6d26fe47847275f9be54dab Author: Sebastian Andrzej Siewior Date: Fri Aug 26 16:50:32 2022 +0200 mm/slub: fix validation races and cleanup locking Add Vlastimil's series as of https://lkml.kernel.org/r/20220823170400.26546-1-vbabka@suse.cz + tglx's "Make PREEMPT_RT support less convoluted" in an updated version which is also part of the thread as 6/5. Signed-off-by: Sebastian Andrzej Siewior commit 89d4985cfc290546d8b894fb26aef7d260d52245 Author: Sebastian Andrzej Siewior Date: Fri Aug 26 16:50:18 2022 +0200 softirq: Redorder the code slightly. Reorder the code slightly so that sparse does not complain about `timersd' being used without a declaration in the !RT case. The compiler removes the variable because it is unused due to !IS_ENABLED but sparse doesn't care. Reported-by: kernel test robot Signed-off-by: Sebastian Andrzej Siewior commit f41315952bc0609766fca3dd69bc61e9a4c24295 Author: Sebastian Andrzej Siewior Date: Fri Aug 26 16:49:01 2022 +0200 Update the "Replace PREEMPT_RT ifdefs with preempt_[dis|en]able_nested()." series. 
Update to v2 as of https://lore.kernel.org/all/20220825164131.402717-1-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior commit de296aa68439680ff8eaedad225a722158f331c4 Author: Michael S. Tsirkin Date: Thu Aug 25 17:30:49 2022 +0200 locking: Detect includes rwlock.h outside of spinlock.h The check for __LINUX_SPINLOCK_H within rwlock.h (and other files) detects the direct include of the header file if it is at the very beginning of the include section. If it is listed later then chances are high that spinlock.h was already included (including rwlock.h) and the additional listing of rwlock.h will not cause any failure. On PREEMPT_RT this additional rwlock.h will lead to compile failures since it uses a different rwlock implementation. Add __LINUX_INSIDE_SPINLOCK_H to spinlock.h and check for this instead of __LINUX_SPINLOCK_H to detect wrong includes. This will help detect direct includes of rwlock.h with or without PREEMPT_RT enabled. [ bigeasy: add remaining __LINUX_SPINLOCK_H user and rewrite commit description. ] Signed-off-by: Michael S. Tsirkin Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/YweemHxJx7O8rjBx@linutronix.de commit d45288af01608c359ee51e9345de77b3eaf5f4bb Author: Sebastian Andrzej Siewior Date: Wed Aug 24 11:42:18 2022 +0200 net: Use u64_stats_fetch_begin_irq() for stats fetch. On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the reader is in preemptible context and the writer side (u64_stats_update_begin*()) runs in an interrupt context (IRQ or softirq) then the writer can update the stats during the read operation. This update remains undetected. Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP are not interrupted by a writer. 32bit-SMP remains unaffected by this change. Cc: "David S.
Miller" Cc: Catherine Sullivan Cc: David Awogbemila Cc: Dimitris Michailidis Cc: Eric Dumazet Cc: Hans Ulli Kroll Cc: Jakub Kicinski Cc: Jeroen de Borst Cc: Johannes Berg Cc: Linus Walleij Cc: Paolo Abeni Cc: Simon Horman Cc: linux-arm-kernel@lists.infradead.org Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Cc: oss-drivers@corigine.com Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/20220825113645.212996-3-bigeasy@linutronix.de commit 54d82b566756d4c01878892f176940d42156d0cb Author: Sebastian Andrzej Siewior Date: Tue Aug 23 17:40:18 2022 +0200 net: dsa: xrs700x: Use irqsave variant for u64 stats update xrs700x_read_port_counters() updates the stats from a worker using the u64_stats_update_begin() version. This is okay on 32bit-UP since on the reader side preemption is disabled. On 32bit-SMP the writer can be preempted by the reader at which point the reader will spin on the seqcount until the writer continues and completes the update. Assigning the mib_mutex mutex to the underlying seqcount would ensure proper synchronisation. The API for that on the u64_stats_init() side isn't available. Since it is the only user, just disable interrupts during the update. Use u64_stats_update_begin_irqsave() on the writer side to ensure an uninterrupted update. Fixes: ee00b24f32eb8 ("net: dsa: add Arrow SpeedChips XRS700x driver") Cc: Andrew Lunn Cc: Florian Fainelli Cc: George McCollister Cc: Vivien Didelot Cc: Vladimir Oltean Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/20220825113645.212996-2-bigeasy@linutronix.de commit d82772e0847d363b1a6568042170f75099af4133 Author: Sebastian Andrzej Siewior Date: Wed Aug 10 11:23:31 2022 +0200 asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig. Remove the CONFIG_PREEMPT_RT symbol from the ifdef around do_softirq_own_stack() and move it to Kconfig instead.
Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT. This ensures that softirq stacks are not used on PREEMPT_RT and avoids a 'select' statement on an option which has a 'depends' statement. Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de Signed-off-by: Sebastian Andrzej Siewior commit be55e3999509c4e8efa433f6f3e9038ca20327a2 Author: Sebastian Andrzej Siewior Date: Mon Aug 22 09:45:56 2022 +0200 v6.0-rc2-rt3 Signed-off-by: Sebastian Andrzej Siewior commit ca46477dc8b34d96c0189db3fc535c8ba4504891 Merge: 2ba05b4f50a6 1c23f9e627a7 Author: Sebastian Andrzej Siewior Date: Mon Aug 22 09:44:20 2022 +0200 Merge tag 'v6.0-rc2' into linux-6.0.y-rt Linux 6.0-rc2 Signed-off-by: Sebastian Andrzej Siewior commit 2ba05b4f50a6a60bef6d2e8b791482feeaf200dd Author: Sebastian Andrzej Siewior Date: Fri Aug 19 17:54:26 2022 +0200 v6.0-rc1-rt2 Signed-off-by: Sebastian Andrzej Siewior commit b4b2c2c7070ae0a117daee0b0b8ed3706222e693 Author: Sebastian Andrzej Siewior Date: Tue Aug 16 09:45:22 2022 +0200 vduse: Remove include of rwlock.h rwlock.h should not be included directly. Instead linux/spinlock.h should be included. Including it directly will break the RT build. Remove the rwlock.h include. Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/20220816074816.173227-1-bigeasy@linutronix.de commit b9f266031d5d9344f82b1cc4daebc2fcff3dd7fc Author: Sebastian Andrzej Siewior Date: Tue Aug 16 09:32:30 2022 +0200 checkpatch: Print an error if rwlock.h is included directly. rwlock.h shouldn't be included directly in source code. PREEMPT_RT uses a different implementation and this rwlock.h include breaks it. Add an error message if linux/rwlock.h is included.
Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/20220816075118.173455-1-bigeasy@linutronix.de commit 4d213ee396a6ce37a8175a60caaa9ad7186681bf Author: Thomas Gleixner Date: Wed Aug 17 18:27:03 2022 +0200 u64_stat: Remove the obsolete fetch_irq() variants Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert all callers to the regular interface and delete the obsolete interfaces. Signed-off-by: Thomas Gleixner Cc: netdev@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-10-bigeasy@linutronix.de commit 513e7e07ef1c141d8879014584aea5b05108114d Author: Thomas Gleixner Date: Wed Aug 17 18:27:02 2022 +0200 u64_stats: Streamline the implementation The u64 stats code handles 3 different cases: - 32bit UP - 32bit SMP - 64bit with an unreadable #ifdef maze, which was recently expanded with PREEMPT_RT conditionals. Reduce it to two cases (32bit and 64bit) and drop the optimization for 32bit UP as suggested by Linus. Use the new preempt_disable/enable_nested() helpers to get rid of the CONFIG_PREEMPT_RT conditionals. Signed-off-by: Thomas Gleixner Cc: netdev@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-9-bigeasy@linutronix.de commit 037e2a4cced8fc00b0f5640a66e23c4d8099a8c6 Author: Thomas Gleixner Date: Wed Aug 17 18:27:01 2022 +0200 mm/compaction: Get rid of RT ifdeffery Move the RT dependency for the initial value of sysctl_compact_unevictable_allowed into Kconfig. 
Signed-off-by: Thomas Gleixner Cc: Andrew Morton Cc: Nick Terrell Cc: linux-mm@kvack.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-8-bigeasy@linutronix.de commit 07a1b9585015ad54ff8ee23a1a7cf0827d3191c2 Author: Thomas Gleixner Date: Wed Aug 17 18:27:00 2022 +0200 mm/memcontrol: Replace the PREEMPT_RT conditionals Use VM_WARN_ON_IRQS_ENABLED() and preempt_disable/enable_nested() to replace the CONFIG_PREEMPT_RT #ifdeffery. Signed-off-by: Thomas Gleixner Cc: Johannes Weiner Cc: Michal Hocko Cc: Roman Gushchin Cc: Shakeel Butt Cc: Muchun Song Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Johannes Weiner Reviewed-by: Muchun Song Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-7-bigeasy@linutronix.de commit 8cd519864735e13b294bb2d9136eb5366e593cf2 Author: Thomas Gleixner Date: Wed Aug 17 18:26:59 2022 +0200 mm/debug: Provide VM_WARN_ON_IRQS_ENABLED() Some places in the VM code expect interrupts disabled, which is a valid expectation on non-PREEMPT_RT kernels, but does not hold on RT kernels in some places because the RT spinlock substitution does not disable interrupts. To avoid sprinkling CONFIG_PREEMPT_RT conditionals into those places, provide VM_WARN_ON_IRQS_ENABLED() which is only enabled when VM_DEBUG=y and PREEMPT_RT=n. Signed-off-by: Thomas Gleixner Cc: Andrew Morton Cc: linux-mm@kvack.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-6-bigeasy@linutronix.de commit ff15cfcb145b4d63114c772a84c048388be51710 Author: Thomas Gleixner Date: Wed Aug 17 18:26:58 2022 +0200 mm/vmstat: Use preempt_[dis|en]able_nested() Replace the open coded CONFIG_PREEMPT_RT conditional preempt_enable/disable() pairs with the new helper functions which hide the underlying implementation details. 
Signed-off-by: Thomas Gleixner Cc: Andrew Morton Cc: linux-mm@kvack.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-5-bigeasy@linutronix.de commit 256713fc7562cd297b1a6ee27566787aa5fe2233 Author: Thomas Gleixner Date: Wed Aug 17 18:26:57 2022 +0200 dentry: Use preempt_[dis|en]able_nested() Replace the open coded CONFIG_PREEMPT_RT conditional preempt_disable/enable() with the new helper. Signed-off-by: Thomas Gleixner Cc: Alexander Viro Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-4-bigeasy@linutronix.de commit 15ad66a2b6f3e5282b49b8a52352cb4ed62d5a86 Author: Thomas Gleixner Date: Wed Aug 17 18:26:56 2022 +0200 preempt: Provide preempt_[dis|en]able_nested() On PREEMPT_RT enabled kernels, spinlocks and rwlocks are neither disabling preemption nor interrupts. Though there are a few places which depend on the implicit preemption/interrupt disable of those locks, e.g. seqcount write sections, per CPU statistics updates etc. To avoid sprinkling CONFIG_PREEMPT_RT conditionals all over the place, add preempt_disable_nested() and preempt_enable_nested() which should be descriptive enough. Add a lockdep assertion for the !PREEMPT_RT case to catch callers which do not have preemption disabled. 
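The behaviour described for preempt_disable_nested() and preempt_enable_nested() can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: struct cpu_model and the model_* function names are hypothetical, a plain integer stands in for the preemption count, and assert() stands in for the lockdep assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for per-CPU kernel state; not kernel code. */
struct cpu_model {
	bool preempt_rt;    /* models a kernel built with PREEMPT_RT */
	int  preempt_count; /* models the preemption-disable nesting depth */
};

/*
 * Model of preempt_disable_nested(): on RT the surrounding spinlock did
 * not disable preemption, so the helper has to do it. On !RT the lock
 * already did, so the helper only asserts that this is the case.
 */
static void model_preempt_disable_nested(struct cpu_model *cpu)
{
	if (cpu->preempt_rt)
		cpu->preempt_count++;
	else
		assert(cpu->preempt_count > 0); /* stands in for the lockdep assertion */
}

/* Model of preempt_enable_nested(): undo the RT-only disable above. */
static void model_preempt_enable_nested(struct cpu_model *cpu)
{
	if (cpu->preempt_rt)
		cpu->preempt_count--;
}
```

In this model a caller on the RT configuration sees the count go from 0 to 1 and back, while a caller on the non-RT configuration must already hold a count of at least 1 (as it would inside a spinlock held region) and sees it unchanged.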
Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ingo Molnar Cc: Juri Lelli Cc: Mel Gorman Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Valentin Schneider Cc: Vincent Guittot Suggested-by: Linus Torvalds Signed-off-by: Thomas Gleixner Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-3-bigeasy@linutronix.de commit 9c858890fc5994eea7376780a1ca380877a1f4b6 Author: Thomas Gleixner Date: Wed Aug 17 18:26:55 2022 +0200 slub: Make PREEMPT_RT support less convoluted The slub code already has a few helpers depending on PREEMPT_RT. Add a few more and get rid of the CONFIG_PREEMPT_RT conditionals all over the place. No functional change. Signed-off-by: Thomas Gleixner Cc: Andrew Morton Cc: Christoph Lameter Cc: David Rientjes Cc: Joonsoo Kim Cc: Pekka Enberg Cc: Vlastimil Babka Cc: linux-mm@kvack.org Signed-off-by: Sebastian Andrzej Siewior Acked-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220817162703.728679-2-bigeasy@linutronix.de commit 488bd2326348c2116cc3f3f43b35d4b679edd190 Author: Thomas Gleixner Date: Fri Jul 8 20:25:16 2011 +0200 Add localversion for -RT release Signed-off-by: Thomas Gleixner commit 1f707261194bd7f262103bf451662b46b9d6eb0b Author: Clark Williams Date: Sat Jul 30 21:55:53 2011 -0500 sysfs: Add /sys/kernel/realtime entry Add a /sys/kernel entry to indicate that the kernel is a realtime kernel. Clark says that he needs this for udev rules, udev needs to evaluate if its a PREEMPT_RT kernel a few thousand times and parsing uname output is too slow or so. Are there better solutions? Should it exist and return 0 on !-rt? Signed-off-by: Clark Williams Signed-off-by: Peter Zijlstra Signed-off-by: Thomas Gleixner commit cd5a1aa40df6af6effb05b461ad62a1e96024d78 Author: Sebastian Andrzej Siewior Date: Fri Oct 11 13:14:41 2019 +0200 POWERPC: Allow to enable RT Allow to select RT. 
Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit b7aa07d43963f960f4a8636a5d3c49a9c796859e Author: Sebastian Andrzej Siewior Date: Tue Mar 26 18:31:29 2019 +0100 powerpc/stackprotector: work around stack-guard init from atomic This is invoked from the secondary CPU in atomic context. On x86 we use tsc instead. On Power we XOR it against mftb() so lets use stack address as the initial value. Cc: stable-rt@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit 6099eae95778005308a409e268257033da6b73dc Author: Bogdan Purcareata Date: Fri Apr 24 15:53:13 2015 +0000 powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT While converting the openpic emulation code to use a raw_spinlock_t enables guests to run on RT, there's still a performance issue. For interrupts sent in directed delivery mode with a multiple CPU mask, the emulated openpic will loop through all of the VCPUs, and for each VCPUs, it call IRQ_check, which will loop through all the pending interrupts for that VCPU. This is done while holding the raw_lock, meaning that in all this time the interrupts and preemption are disabled on the host Linux. A malicious user app can max both these number and cause a DoS. This temporary fix is sent for two reasons. First is so that users who want to use the in-kernel MPIC emulation are aware of the potential latencies, thus making sure that the hardware MPIC and their usage scenario does not involve interrupts sent in directed delivery mode, and the number of possible pending interrupts is kept small. Secondly, this should incentivize the development of a proper openpic emulation that would be better suited for RT. 
Acked-by: Scott Wood Signed-off-by: Bogdan Purcareata Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit 1de3ac8115604bb31e3f285f006f3754a9da8c78 Author: Sebastian Andrzej Siewior Date: Tue Mar 26 18:31:54 2019 +0100 powerpc/pseries/iommu: Use a locallock instead local_irq_save() The locallock protects the per-CPU variable tce_page. The function attempts to allocate memory while tce_page is protected (by disabling interrupts). Use local_irq_save() instead of local_irq_disable(). Cc: stable-rt@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit 1b6b7e5f162170ebffa693d00caedf65d6ddd9af Author: Sebastian Andrzej Siewior Date: Fri Jul 26 11:30:49 2019 +0200 powerpc: traps: Use PREEMPT_RT Add PREEMPT_RT to the backtrace if enabled. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit 7a8f6eade8af5e77d46883acaa9845b3f38e383d Author: Sebastian Andrzej Siewior Date: Fri Oct 11 13:14:35 2019 +0200 ARM64: Allow to enable RT Allow to select RT. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit eacf72911783a432b7cc5bb3dc07373007e6fc40 Author: Sebastian Andrzej Siewior Date: Fri Oct 11 13:14:29 2019 +0200 ARM: Allow to enable RT Allow to select RT. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit bf6fafe213c8c9e0f33e2ec2b3ca56a76c01304d Author: Thomas Gleixner Date: Tue Jan 8 21:36:51 2013 +0100 tty/serial/pl011: Make the locking work on RT The lock is a sleeping lock and local_irq_save() is not the optimisation we are looking for. Redo it to make it work on -RT and non-RT. Signed-off-by: Thomas Gleixner commit b32c80d82625d2cd28ebec26ce8b2365b7f3d1a4 Author: Thomas Gleixner Date: Thu Jul 28 13:32:57 2011 +0200 tty/serial/omap: Make the locking RT aware The lock is a sleeping lock and local_irq_save() is not the optimisation we are looking for. Redo it to make it work on -RT and non-RT.
Signed-off-by: Thomas Gleixner commit da0f417555b6a9a1340d3a1f35e8f2b9c8b9bc7d Author: Yadi.hu Date: Wed Dec 10 10:32:09 2014 +0800 ARM: enable irq in translation/section permission fault handlers Probably happens on all ARM, with CONFIG_PREEMPT_RT CONFIG_DEBUG_ATOMIC_SLEEP This simple program.... int main() { *((char*)0xc0001000) = 0; }; [ 512.742724] BUG: sleeping function called from invalid context at kernel/rtmutex.c:658 [ 512.743000] in_atomic(): 0, irqs_disabled(): 128, pid: 994, name: a [ 512.743217] INFO: lockdep is turned off. [ 512.743360] irq event stamp: 0 [ 512.743482] hardirqs last enabled at (0): [< (null)>] (null) [ 512.743714] hardirqs last disabled at (0): [] copy_process+0x3b0/0x11c0 [ 512.744013] softirqs last enabled at (0): [] copy_process+0x3b0/0x11c0 [ 512.744303] softirqs last disabled at (0): [< (null)>] (null) [ 512.744631] [] (unwind_backtrace+0x0/0x104) [ 512.745001] [] (dump_stack+0x20/0x24) [ 512.745355] [] (__might_sleep+0x1dc/0x1e0) [ 512.745717] [] (rt_spin_lock+0x34/0x6c) [ 512.746073] [] (do_force_sig_info+0x34/0xf0) [ 512.746457] [] (force_sig_info+0x18/0x1c) [ 512.746829] [] (__do_user_fault+0x9c/0xd8) [ 512.747185] [] (do_bad_area+0x7c/0x94) [ 512.747536] [] (do_sect_fault+0x40/0x48) [ 512.747898] [] (do_DataAbort+0x40/0xa0) [ 512.748181] Exception stack(0xecaa1fb0 to 0xecaa1ff8) 0xc0000000 belongs to kernel address space, a user task must not be allowed to access it. For the above condition, the correct result is that the test case receives a "segmentation fault" and exits, without the backtrace above. The root cause is commit 02fe2845d6a8 ("avoid enabling interrupts in prefetch/data abort handlers"), which deletes the irq enable block in the Data abort assembly code and moves it into the page/breakpoint/alignment fault handlers instead. But the author does not enable irq in translation/section permission fault handlers.
ARM disables irq when it enters exception/ interrupt mode, if kernel doesn't enable irq, it would be still disabled during translation/section permission fault. We see the above splat because do_force_sig_info is still called with IRQs off, and that code eventually does a: spin_lock_irqsave(&t->sighand->siglock, flags); As this is architecture independent code, and we've not seen any other need for other arch to have the siglock converted to raw lock, we can conclude that we should enable irq for ARM translation/section permission exception. Signed-off-by: Yadi.hu Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit 2b0fd45d316a753fb0cd8b9224b6907df2a34db2 Author: Thomas Gleixner Date: Wed Jul 8 17:14:48 2015 +0200 arm: Disable jump-label on PREEMPT_RT. jump-labels are used to efficiently switch between two possible code paths. To achieve this, stop_machine() is used to keep the CPU in a known state while the opcode is modified. The usage of stop_machine() here leads to large latency spikes which can be observed on PREEMPT_RT. Jump labels may change the target during runtime and are not restricted to debug or "configuration/ setup" part of a PREEMPT_RT system where high latencies could be defined as acceptable. Disable jump-label support on a PREEMPT_RT system. [bigeasy: Patch description.] Signed-off-by: Thomas Gleixner Signed-off-by: Sebastian Andrzej Siewior Link: https://lkml.kernel.org/r/20220613182447.112191-2-bigeasy@linutronix.de commit a944c32ec3369c6c191164db0d7c5d2fa8e932cc Author: Anders Roxell Date: Thu May 14 17:52:17 2015 +0200 arch/arm64: Add lazy preempt support arm64 is missing support for PREEMPT_RT. The main feature which is lacking is support for lazy preemption. The arch-specific entry code, thread information structure definitions, and associated data tables have to be extended to provide this support. 
Then the Kconfig file has to be extended to indicate the support is available, and also to indicate that support for full RT preemption is now available. Signed-off-by: Anders Roxell Signed-off-by: Thomas Gleixner commit 004174acf989187c0b95b366b1b5ff9a22da474d Author: Thomas Gleixner Date: Thu Nov 1 10:14:11 2012 +0100 powerpc: Add support for lazy preemption Implement the powerpc pieces for lazy preempt. Signed-off-by: Thomas Gleixner commit 4616fd77c7d61f0cc19457c5efb37f6c61e1b499 Author: Thomas Gleixner Date: Wed Oct 31 12:04:11 2012 +0100 arm: Add support for lazy preemption Implement the arm pieces for lazy preempt. Signed-off-by: Thomas Gleixner commit 47a1858e631b3485e1caf5bafbfb0e14eb4841c9 Author: Thomas Gleixner Date: Tue Jul 13 07:52:52 2021 +0200 entry: Fix the preempt lazy fallout Common code needs common defines.... Fixes: f2f9e496208c ("x86: Support for lazy preemption") Reported-by: kernel test robot Signed-off-by: Thomas Gleixner commit 88d2072447d95b93a1ddcedefb8287def7a90488 Author: Thomas Gleixner Date: Thu Nov 1 11:03:47 2012 +0100 x86: Support for lazy preemption Implement the x86 pieces for lazy preempt. Signed-off-by: Thomas Gleixner commit a7588475b5f1eabeed6ee2c8005a060cda2d28e6 Author: Sebastian Andrzej Siewior Date: Tue Jun 30 11:45:14 2020 +0200 x86/entry: Use should_resched() in idtentry_exit_cond_resched() The TIF_NEED_RESCHED bit is inlined on x86 into the preemption counter. By using should_resched(0) instead of need_resched() the same check can be performed which uses the same variable as 'preempt_count()` which was issued before. Use should_resched(0) instead need_resched(). Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner commit c5e1aa83b8d6d821df0d3ed54b8ac13b3f1b1897 Author: Thomas Gleixner Date: Fri Oct 26 18:50:54 2012 +0100 sched: Add support for lazy preemption It has become an obsession to mitigate the determinism vs. throughput loss of RT. 
Looking at the mainline semantics of preemption points gives a hint why RT sucks throughput wise for ordinary SCHED_OTHER tasks. One major issue is the wakeup of tasks which are right away preempting the waking task while the waking task holds a lock on which the woken task will block right after having preempted the waker. In mainline this is prevented due to the implicit preemption disable of spin/rw_lock held regions. On RT this is not possible due to the fully preemptible nature of sleeping spinlocks. Though for a SCHED_OTHER task preempting another SCHED_OTHER task this is really not a correctness issue. RT folks are concerned about SCHED_FIFO/RR tasks preemption and not about the purely fairness driven SCHED_OTHER preemption latencies. So I introduced a lazy preemption mechanism which only applies to SCHED_OTHER tasks preempting another SCHED_OTHER task. Aside of the existing preempt_count each task now sports a preempt_lazy_count which is manipulated on lock acquisition and release. This is slightly incorrect as for laziness reasons I coupled this on migrate_disable/enable so some other mechanisms get the same treatment (e.g. get_cpu_light). Now on the scheduler side instead of setting NEED_RESCHED this sets NEED_RESCHED_LAZY in case of a SCHED_OTHER/SCHED_OTHER preemption and therefore allows the waking task to exit the lock held region before the woken task preempts. That also works better for cross CPU wakeups as the other side can stay in the adaptive spinning loop. For RT class preemption there is no change. This simply sets NEED_RESCHED and forgoes the lazy preemption counter.
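The decision described above can be sketched as a simplified userspace model. This is not the scheduler code: the MODEL_* enumerators and the model_* helpers are hypothetical names, and the model ignores the regular preempt_count entirely. It only shows which flag is chosen for a wakeup and when a pending request would be honoured with respect to preempt_lazy_count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for scheduling policies and resched flags. */
enum model_policy { MODEL_SCHED_OTHER, MODEL_SCHED_FIFO, MODEL_SCHED_RR };
enum model_flag   { MODEL_NEED_RESCHED, MODEL_NEED_RESCHED_LAZY };

/*
 * Pick the flag for a wakeup: only a SCHED_OTHER task preempting another
 * SCHED_OTHER task gets the lazy flag. Any RT class involvement keeps
 * the regular NEED_RESCHED behaviour.
 */
static enum model_flag model_pick_flag(enum model_policy curr,
				       enum model_policy woken)
{
	if (curr == MODEL_SCHED_OTHER && woken == MODEL_SCHED_OTHER)
		return MODEL_NEED_RESCHED_LAZY;
	return MODEL_NEED_RESCHED;
}

/*
 * A lazy request is deferred while the running task is inside a lock
 * held region (preempt_lazy_count > 0); a regular request is not.
 */
static bool model_may_preempt_now(enum model_flag flag, int preempt_lazy_count)
{
	assert(preempt_lazy_count >= 0); /* the counter never goes negative */
	if (flag == MODEL_NEED_RESCHED)
		return true;
	return preempt_lazy_count == 0;
}
```

In this model a SCHED_OTHER waker holding one lock is not preempted by a woken SCHED_OTHER task until its preempt_lazy_count drops back to zero, whereas a woken SCHED_FIFO task preempts immediately.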
Initial tests do not expose any observable latency increase, but history shows that I've been proven wrong before :) The lazy preemption mode is per default on, but with CONFIG_SCHED_DEBUG enabled it can be disabled via: # echo NO_PREEMPT_LAZY >/sys/kernel/debug/sched_features and reenabled via # echo PREEMPT_LAZY >/sys/kernel/debug/sched_features The test results so far are very machine and workload dependent, but there is a clear trend that it enhances the non RT workload performance. Signed-off-by: Thomas Gleixner commit 24f18876695ad5959d68fd5c5550dbb8f143a0d5 Author: Sebastian Andrzej Siewior Date: Mon Feb 21 17:59:14 2022 +0100 Revert "drm/i915: Depend on !PREEMPT_RT." Once the known issues are addressed, it should be safe to enable the driver. Signed-off-by: Sebastian Andrzej Siewior commit 39e153cf1364d777d4c3ac959f0f44832b244d19 Author: Sebastian Andrzej Siewior Date: Fri Oct 1 20:01:03 2021 +0200 drm/i915: Drop the irqs_disabled() check The !irqs_disabled() check triggers on PREEMPT_RT even with i915_sched_engine::lock acquired. The reason is the lock is transformed into a sleeping lock on PREEMPT_RT and does not disable interrupts. There is no need to check for disabled interrupts. The lockdep annotation below already checks if the lock has been acquired by the caller and will yell if the interrupts are not disabled. Remove the !irqs_disabled() check. Reported-by: Maarten Lankhorst Signed-off-by: Sebastian Andrzej Siewior commit 0bb7d0ec9b6fc75a7b316778c8c63846901cd74e Author: Sebastian Andrzej Siewior Date: Wed Sep 8 19:03:41 2021 +0200 drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock() execlists_dequeue() is invoked from a function which uses local_irq_disable() to disable interrupts so the spin_lock() behaves like spin_lock_irq(). This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not the same as spin_lock_irq(). execlists_dequeue_irq() and execlists_dequeue() each have only one caller.
    If intel_engine_cs::active::lock is acquired and released with the _irq suffix then it behaves almost as if execlists_dequeue() would be invoked with disabled interrupts. The difference is the last part of the function which is then invoked with enabled interrupts. I can't tell if this makes a difference. From looking at it, it might work to move the last unlock to the end of the function as I didn't find anything that would acquire the lock again.

    Reported-by: Clark Williams
    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Maarten Lankhorst

commit 13b5e229e84757ed48bd9e66127eab59b4f9b3ed
Author: Sebastian Andrzej Siewior
Date:   Wed Sep 8 17:18:00 2021 +0200

    drm/i915/gt: Queue and wait for the irq_work item.

    Disabling interrupts and invoking the irq_work function directly breaks on PREEMPT_RT. PREEMPT_RT does not invoke all irq_work from hardirq context because some of the users have spinlock_t locking in the callback function. These locks are then turned into sleeping locks which cannot be acquired with disabled interrupts.

    Using irq_work_queue() has the benefit that the irq_work will be invoked in the regular context. In general there is "no" delay between enqueuing the callback and its invocation because the interrupt is raised right away on architectures which support it (which includes x86).

    Use irq_work_queue() + irq_work_sync() instead of invoking the callback directly.

    Reported-by: Clark Williams
    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Maarten Lankhorst

commit efd2851b223d4071f9e4adccfcd770620080b93e
Author: Sebastian Andrzej Siewior
Date:   Wed Dec 19 10:47:02 2018 +0100

    drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with NOTRACE

    The order of the header files is important. If this header file is included after tracepoint.h was included then the NOTRACE here becomes a nop. Currently this happens for two .c files which use the tracepoints behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
    Cc: Steven Rostedt
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner

commit 6175bb7a824da861b1fd2d1061e55f95089a7413
Author: Sebastian Andrzej Siewior
Date:   Thu Dec 6 09:52:20 2018 +0100

    drm/i915: Disable tracing points on PREEMPT_RT

    Luca Abeni reported this:
    | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
    | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
    | Call Trace:
    |  rt_spin_lock+0x3f/0x50
    |  gen6_read32+0x45/0x1d0 [i915]
    |  g4x_get_vblank_counter+0x36/0x40 [i915]
    |  trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]

    The tracing events, trace_i915_pipe_update_start() among others, use functions which acquire spinlock_t locks which are transformed into sleeping locks on PREEMPT_RT. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter() which also might acquire sleeping locks on PREEMPT_RT. At the time the arguments are evaluated within the trace point, preemption is disabled and so the locks must not be acquired on PREEMPT_RT.

    Based on this I don't see any other way than to disable trace points on PREEMPT_RT.

    Reported-by: Luca Abeni
    Cc: Steven Rostedt
    Signed-off-by: Sebastian Andrzej Siewior

commit ef1302908f43bd42bd0fe30923fa78ea0a2df7f1
Author: Sebastian Andrzej Siewior
Date:   Mon Oct 25 15:05:18 2021 +0200

    drm/i915: Don't check for atomic context on PREEMPT_RT

    The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT because the uncore::lock is a spinlock_t and does not disable preemption or interrupts.

    Changing the uncore::lock to a raw_spinlock_t doubles the worst case latency on an otherwise idle testbox during testing. Therefore I'm currently unsure about changing this.
    Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
    Signed-off-by: Sebastian Andrzej Siewior

commit ce6e63d1e627887ded359e064eb77371327bb7a6
Author: Mike Galbraith
Date:   Sat Feb 27 09:01:42 2016 +0100

    drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates

    Commit 8d7849db3eab7 ("drm/i915: Make sprite updates atomic") started disabling interrupts across atomic updates. This breaks on PREEMPT_RT because within this section the code attempts to acquire spinlock_t locks which are sleeping locks on PREEMPT_RT.

    According to the comment the interrupts are disabled to avoid random delays and not required for protection or synchronisation. If this needs to happen with disabled interrupts on PREEMPT_RT, and the whole section is restricted to register access, then all sleeping locks need to be acquired before interrupts are disabled and some functions may need to be moved after enabling interrupts again. This includes:
    - prepare_to_wait() + finish_wait() due to its wake queue.
    - drm_crtc_vblank_put() -> vblank_disable_fn() drm_device::vbl_lock.
    - skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and maybe others due to intel_uncore::lock
    - drm_crtc_arm_vblank_event() due to drm_device::event_lock and drm_device::vblank_time_lock.

    Don't disable interrupts on PREEMPT_RT during atomic updates.

    [bigeasy: drop local locks, commit message]

    Signed-off-by: Mike Galbraith
    Signed-off-by: Sebastian Andrzej Siewior

commit 5ffa1341b9b375201d9612b47807c3c28e20f39b
Author: Mike Galbraith
Date:   Sat Feb 27 08:09:11 2016 +0100

    drm/i915: Use preempt_disable/enable_rt() where recommended

    Mario Kleiner suggests in commit ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into kms driver.") spots where preemption should be disabled on PREEMPT_RT. The difference is that on PREEMPT_RT the intel_uncore::lock disables neither preemption nor interrupts and so the region remains preemptible.

    The area covers only register reads and writes.
    The part that worries me is:
    - __intel_get_crtc_scanline() the worst case is 100us if no match is found.
    - intel_crtc_scanlines_since_frame_timestamp() not sure how long this may take in the worst case.

    It was in the RT queue for a while and nobody complained. Disable preemption on PREEMPT_RT during timestamping.

    [bigeasy: patch description.]

    Cc: Mario Kleiner
    Signed-off-by: Mike Galbraith
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior

commit 9998443ac0c4d5f74b21a2a32a3c2748789fa015
Author: John Ogness
Date:   Fri Feb 4 16:01:17 2022 +0106

    printk: avoid preempt_disable() for PREEMPT_RT

    During non-normal operation, printk() calls will attempt to write the messages directly to the consoles. This involves using console_trylock() to acquire @console_sem.

    Preemption is disabled while directly printing to the consoles in order to ensure that the printing task is not scheduled away while holding @console_sem, thus blocking all other printers and causing delays in printing.

    Commit fd5f7cde1b85 ("printk: Never set console_may_schedule in console_trylock()") specifically reverted a previous attempt at allowing preemption while printing.

    However, on PREEMPT_RT systems, disabling preemption while printing is not allowed because console drivers typically acquire a spin lock (which under PREEMPT_RT is an rtmutex). Since direct printing is only used during early boot and non-panic dumps, the risks of delayed print output for these scenarios will be accepted under PREEMPT_RT.

    Signed-off-by: John Ogness
    Signed-off-by: Sebastian Andrzej Siewior

commit 0f3352c867eec058542e57fb7a0cda5829a00d36
Author: John Ogness
Date:   Fri Feb 4 16:01:17 2022 +0106

    serial: 8250: implement write_atomic

    Implement a non-sleeping NMI-safe write_atomic() console function in order to support atomic console printing during a panic.

    Transmitting data requires disabling interrupts.
    Since write_atomic() can be called from any context, it may be called while another CPU is executing in console code. In order to maintain the correct state of the IER register, use the global cpu_sync to synchronize all access to the IER register. This synchronization is only necessary for serial ports that are being used as consoles.

    The global cpu_sync is also used to synchronize between the write() and write_atomic() callbacks. write() synchronizes per character, write_atomic() synchronizes per line.

    Signed-off-by: John Ogness
    Signed-off-by: Sebastian Andrzej Siewior

commit e3c8f28432e291b06c5006013cb24e76f1536204
Author: John Ogness
Date:   Fri Feb 4 16:01:17 2022 +0106

    printk: add infrastucture for atomic consoles

    Many times it is not possible to see the console output on panic because printing threads cannot be scheduled and/or the console is already taken and forcibly overtaking/busting the locks does not provide the hoped-for results.

    Introduce a new infrastructure to support "atomic consoles". A new optional callback in struct console, write_atomic(), is available for consoles to provide an implementation for writing console messages. The implementation must be NMI safe if it can run on an architecture where NMIs exist.

    Console drivers implementing the write_atomic() callback must also select CONFIG_HAVE_ATOMIC_CONSOLE in order to enable the atomic console code within the printk subsystem.

    If atomic consoles are available, panic() will flush the kernel log only to the atomic consoles (before busting spinlocks). Afterwards, panic() will continue as before, which includes attempting to flush the other (non-atomic) consoles.

    Signed-off-by: John Ogness
    Signed-off-by: Sebastian Andrzej Siewior

commit 12fb61a66bfee4ca736062eeca7d7921dfe9cf89
Author: Sebastian Andrzej Siewior
Date:   Tue Jul 19 20:08:01 2022 +0200

    printk: Bring back the RT bits.
    This is a revert of the commits:
    | 07a22b61946f0 Revert "printk: add functions to prefer direct printing"
    | 5831788afb17b Revert "printk: add kthread console printers"
    | 2d9ef940f89e0 Revert "printk: extend console_lock for per-console locking"
    | 007eeab7e9f03 Revert "printk: remove @console_locked"
    | 05c96b3713aa2 Revert "printk: Block console kthreads when direct printing will be required"
    | 20fb0c8272bbb Revert "printk: Wait for the global console lock when the system is going down"

    which is needed for the atomic consoles which are used on PREEMPT_RT.

    Signed-off-by: Sebastian Andrzej Siewior

commit 6bd5efef3617ffd103ca38af8660028b10323343
Author: Sebastian Andrzej Siewior
Date:   Fri Mar 11 17:44:57 2022 +0100

    locking/lockdep: Remove lockdep_init_map_crosslock.

    The cross-release bits have been removed, lockdep_init_map_crosslock() is a leftover.

    Remove lockdep_init_map_crosslock.

    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Waiman Long
    Link: https://lore.kernel.org/r/20220311164457.46461-1-bigeasy@linutronix.de
    Link: https://lore.kernel.org/r/YqITgY+2aPITu96z@linutronix.de

commit c1d1db13ebc9d184c2005c8833318882a02e8ef7
Author: Mike Galbraith
Date:   Thu Mar 31 04:08:28 2016 +0200

    zram: Replace bit spinlocks with spinlock_t for PREEMPT_RT.

    The bit spinlock disables preemption on PREEMPT_RT. With disabled preemption it is not allowed to acquire other sleeping locks which includes invoking zs_free().

    Use a spinlock_t on PREEMPT_RT for locking and set/clear ZRAM_LOCK after the lock has been acquired/dropped.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lkml.kernel.org/r/YqIbMuHCPiQk+Ac2@linutronix.de

commit 48b2d78e90b665d1669813628084382344a56343
Author: Haris Okanovic
Date:   Tue Aug 15 15:13:08 2017 -0500

    tpm_tis: fix stall after iowrite*()s

    ioread8() operations to TPM MMIO addresses can stall the cpu when immediately following a sequence of iowrite*()'s to the same region.
    For example, cyclictest measures ~400us latency spikes when a non-RT usermode application communicates with an SPI-based TPM chip (Intel Atom E3940 system, PREEMPT_RT kernel). The spikes are caused by a stalling ioread8() operation following a sequence of 30+ iowrite8()s to the same address. I believe this happens because the write sequence is buffered (in cpu or somewhere along the bus), and gets flushed on the first LOAD instruction (ioread*()) that follows.

    The enclosed change appears to fix this issue: read the TPM chip's access register (status code) after every iowrite*() operation to amortize the cost of flushing data to chip across multiple instructions.

    Signed-off-by: Haris Okanovic
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner

commit b4b444f911965f2ac66e44ceaa60b6c5202983a5
Author: Frederic Weisbecker
Date:   Tue Apr 5 03:07:52 2022 +0200

    tick: Fix timer storm since introduction of timersd

    If timers are pending while the tick is reprogrammed on nohz_mode, the next expiry is not armed to fire now, it is delayed one jiffy forward instead so as not to raise an inextinguishable timer storm with such a scenario:

    1) IRQ triggers and queues a timer
    2) ksoftirqd() is woken up
    3) IRQ tail: timer is reprogrammed to fire now
    4) IRQ exit
    5) TIMER interrupt
    6) goto 3)

    ...all that until we finally reach ksoftirqd.

    Unfortunately we are checking the wrong softirq vector bitmask since the timersd kthread has split from ksoftirqd. Timers now have their own vector state field that must be checked separately. As a result, the old timer storm is back.
    This shows up early on boot with extremely long initcalls:

     [ 333.004807] initcall dquot_init+0x0/0x111 returned 0 after 323822879 usecs

    and the cause is uncovered with the right trace events showing just 10 microseconds between ticks (~100 000 Hz):

    |swapper/-1 1dn.h111 60818582us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415486608
    |swapper/-1 1dn.h111 60818592us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415496082
    |swapper/-1 1dn.h111 60818601us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415505550

    Fix this by checking the right timer vector state from the nohz code.

    Signed-off-by: Frederic Weisbecker
    Cc: Mel Gorman
    Cc: Sebastian Andrzej Siewior
    Cc: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lkml.kernel.org/r/20220405010752.1347437-2-frederic@kernel.org

commit 3b432364d5ddb70106ed11957b97cd34e44c6e2a
Author: Frederic Weisbecker
Date:   Tue Apr 5 03:07:51 2022 +0200

    rcutorture: Also force sched priority to timersd on boosting test.

    ksoftirqd is statically boosted to the priority level right above the one of rcu_torture_boost() so that timers, which torture readers rely on, get a chance to run while rcu_torture_boost() is polling.

    However timers processing got split from ksoftirqd into their own kthread (timersd) that isn't boosted. It has the same SCHED_FIFO low prio as rcu_torture_boost() and therefore timers can't preempt it and may starve.
    The issue can be triggered in practice on v5.17.1-rt17 using:

     ./kvm.sh --allcpus --configs TREE04 --duration 10m --kconfig "CONFIG_EXPERT=y CONFIG_PREEMPT_RT=y"

    Fix this by statically boosting timersd just like is done with ksoftirqd in commit ea6d962e80b61 ("rcutorture: Judge RCU priority boosting on grace periods, not callbacks")

    Suggested-by: Mel Gorman
    Cc: Sebastian Andrzej Siewior
    Cc: Thomas Gleixner
    Signed-off-by: Frederic Weisbecker
    Link: https://lkml.kernel.org/r/20220405010752.1347437-1-frederic@kernel.org
    Signed-off-by: Sebastian Andrzej Siewior

commit 8b36250a3001f64cfc8ab4e85bece134f5ef9ac2
Author: Sebastian Andrzej Siewior
Date:   Wed Dec 1 17:41:09 2021 +0100

    softirq: Use a dedicated thread for timer wakeups.

    A timer/hrtimer softirq is raised in-IRQ context. With threaded interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd for the processing of the softirq.

    Once the ksoftirqd is marked as pending (or is running) it will collect all raised softirqs. This in turn means that a softirq which would have been processed at the end of the threaded interrupt, which runs at an elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER priority and competes with every regular task for CPU resources. This introduces long delays on heavily loaded systems and is not desired especially if the system is not overloaded by the softirqs.

    Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated timers thread and let it run at the lowest SCHED_FIFO priority. RT tasks are woken up from hardirq context so only timer_list timers and hrtimers for "regular" tasks are processed here. The higher priority ensures that wakeups are performed before scheduling SCHED_OTHER tasks.

    Using a dedicated variable to store the pending softirq bits ensures that the timers are not accidentally picked up by ksoftirqd and other threaded interrupts. It shouldn't be picked up by ksoftirqd since it runs at lower priority.
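The effect of the dedicated pending variable can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: the reduced vector enum, the struct `cpu_softirq_state` and all function names are hypothetical; only the idea of keeping TIMER_SOFTIRQ and HRTIMER_SOFTIRQ bits in their own word comes from the description above.

```c
#include <stdint.h>

/* Reduced, local list of softirq vectors (the kernel has more). */
enum { HI_SOFTIRQ, TIMER_SOFTIRQ, NET_RX_SOFTIRQ, HRTIMER_SOFTIRQ };

#define TIMER_MASK ((1u << TIMER_SOFTIRQ) | (1u << HRTIMER_SOFTIRQ))

/* Hypothetical per-CPU state: two separate pending words. */
struct cpu_softirq_state {
    uint32_t pending;         /* collected by ksoftirqd / threaded IRQs */
    uint32_t timers_pending;  /* collected only by the timer thread */
};

/* Route timer vectors to the dedicated word, everything else as before. */
static void raise_softirq_model(struct cpu_softirq_state *s, unsigned int nr)
{
    uint32_t bit = 1u << nr;

    if (bit & TIMER_MASK)
        s->timers_pending |= bit;
    else
        s->pending |= bit;
}

/* ksoftirqd only ever sees the regular word. */
static uint32_t ksoftirqd_collect(struct cpu_softirq_state *s)
{
    uint32_t bits = s->pending;

    s->pending = 0;
    return bits;
}

/* The timer thread only ever sees the timer word. */
static uint32_t timersd_collect(struct cpu_softirq_state *s)
{
    uint32_t bits = s->timers_pending;

    s->timers_pending = 0;
    return bits;
}
```

Because the two words are disjoint, any code that asks whether timer work is pending has to look at the timer word in this model. Checking only the regular word would miss it.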
    However if the timer bits are ORed while a threaded interrupt is running, then the timer softirq would be performed at higher priority. The new timer thread will block on the softirq lock before it starts softirq work. This "race window" isn't closed because while the timer thread is performing the softirq it can get PI-boosted via the softirq lock by a random force-threaded thread. The timer thread can pick up pending softirqs from ksoftirqd but only if the softirq load is high. It is not desired that the picked up softirqs are processed at SCHED_FIFO priority under high softirq load but this can already happen by a PI-boost by a force-threaded interrupt.

    Reported-by: kernel test robot [ static timer_threads ]
    Signed-off-by: Sebastian Andrzej Siewior

commit 789fdb59071a26ccd627e1d9c879fb8e4f6dc54d
Author: Sebastian Andrzej Siewior
Date:   Thu Nov 7 17:49:20 2019 +0100

    x86: Enable RT also on 32bit

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner

commit 652085c2191eb49c42a757142b0469d09784c127
Author: Sebastian Andrzej Siewior
Date:   Wed Aug 7 18:15:38 2019 +0200

    x86: Allow to enable RT

    Allow to select RT.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner

commit 00fe154600a3f4a27dcea20604f3de42ec5f2641
Author: Sebastian Andrzej Siewior
Date:   Mon Aug 15 11:39:52 2022 +0200

    flex_proportions: Disable preemption entering the write section.

    The seqcount fprop_global::sequence is not associated with a lock. The write section (fprop_new_period()) is invoked from a timer and since the softirq is preemptible on PREEMPT_RT it is possible to preempt the write section which is not desired.

    Disable preemption around the write section.

    Signed-off-by: Sebastian Andrzej Siewior

commit 51ecbca65130102286760fecaacd9ff619472267
Author: Sebastian Andrzej Siewior
Date:   Mon Aug 15 17:29:50 2022 +0200

    net: Avoid the IPI to free the

    skb_attempt_defer_free() collects skbs, which were allocated on a remote CPU, on a per-CPU list.
    These skbs are either freed on that remote CPU once the CPU enters NET_RX or a remote IPI function is invoked to raise the NET_RX softirq if a threshold of pending skbs has been exceeded. This remote IPI can cause the wakeup of ksoftirqd on PREEMPT_RT if the remote CPU was idle. This is undesired because once the ksoftirqd is running it will acquire all pending softirqs and they will not be executed as part of the threaded interrupt until ksoftirqd goes idle again.

    To avoid all this, schedule the deferred clean up from a worker.

    Signed-off-by: Sebastian Andrzej Siewior

commit 37a8744bbd5fcb663efcbebf1ddc1b6e19ae3984
Author: Sebastian Andrzej Siewior
Date:   Wed Jun 22 12:27:05 2022 +0200

    sched: Consider task_struct::saved_state in wait_task_inactive().

    Ptrace is using wait_task_inactive() to wait for the tracee to reach a certain task state. On PREEMPT_RT that state may be stored in task_struct::saved_state while the tracee blocks on a sleeping lock and task_struct::__state is set to TASK_RTLOCK_WAIT.

    It is not possible to check only for TASK_RTLOCK_WAIT to be sure that the task is blocked on a sleeping lock because during wake up (after the sleeping lock has been acquired) the task state is set to TASK_RUNNING. After the task is on CPU and has acquired the pi_lock it will reset the state accordingly but until then TASK_RUNNING will be observed (with the desired state saved in saved_state).

    Check also for task_struct::saved_state if the desired match was not found in task_struct::__state on PREEMPT_RT. If the state was found in saved_state, wait until the task is idle and the state is visible in task_struct::__state.

    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Valentin Schneider
    Link: https://lkml.kernel.org/r/Yt%2FpQAFQ1xKNK0RY@linutronix.de

commit 22ba0a69ed2d73219c7079fec23e19744bb9c604
Author: Sebastian Andrzej Siewior
Date:   Wed Jun 22 11:36:17 2022 +0200

    signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
    Commit 53da1d9456fe7 ("fix ptrace slowness") is just a band-aid around the problem.

    The invocation of do_notify_parent_cldstop() wakes the parent and makes it runnable. The scheduler then wants to replace this still running task with the parent. With the read_lock() acquired this is not possible because preemption is disabled and so this is deferred until read_unlock(). This scheduling point is undesired and is avoided by disabling preemption around the unlock operation and enabling it again before the schedule() invocation without a preemption point.

    This is only undesired because the parent sleeps a cycle in wait_task_inactive() until the traced task leaves the run-queue in schedule(). It is not a correctness issue, it is just a band-aid to avoid the visible delay which sums up over multiple invocations.

    The task can still be preempted if an interrupt occurs between preempt_enable_no_resched() and freezable_schedule() because on the IRQ-exit path of the interrupt scheduling _will_ happen. This is ignored since it does not happen very often.

    On PREEMPT_RT keeping preemption disabled during the invocation of cgroup_enter_frozen() becomes a problem because the function acquires css_set_lock which is a sleeping lock on PREEMPT_RT and must not be acquired with disabled preemption.

    Don't disable preemption on PREEMPT_RT. Remove the TODO regarding adding read_unlock_no_resched() as there is no need for it and it will cause harm.

    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lkml.kernel.org/r/20220720154435.232749-2-bigeasy@linutronix.de

commit c18934414c643231ba27581aa1f0c263d1d6d00b
Author: Sebastian Andrzej Siewior
Date:   Mon Aug 1 11:34:33 2022 +0200

    lib/vsprintf: Initialize vsprintf's pointer hash once the random core is ready.

    The printk code invokes vsnprintf in order to compute the complete string before adding it into its buffer.
    This happens in an IRQ-off region which leads to a warning on PREEMPT_RT in the random code if the format string contains a %p for pointer printing. This happens because the random core acquires locks which become sleeping locks on PREEMPT_RT which must not be acquired with interrupts and/or preemption disabled.

    By default the pointers are hashed which requires a random value on the first invocation (either by printk or another user, whichever comes first).

    One could argue that there is no need for printk to disable interrupts during the vsprintf() invocation which would fix the just mentioned problem. However printk itself can be invoked in a context with disabled interrupts which would lead to the very same problem.

    Move the initialization of ptr_key into a worker and schedule it from subsys_initcall(). This happens early but after the workqueue subsystem is ready. Use get_random_bytes() to retrieve the random value if the RNG core is ready, otherwise schedule a worker in two seconds and try again.

    Reported-by: Mike Galbraith
    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lkml.kernel.org/r/YueeIgPGUJgsnsAh@linutronix.de

commit bea6a87978e4b94cdaf83f5963fdba19441f991e
Author: Sebastian Andrzej Siewior
Date:   Fri Jul 29 15:52:45 2022 +0200

    lib/vsprintf: Remove static_branch_likely() from __ptr_to_hashval().

    Using static_branch_likely() to signal that ptr_key has been filled is a bit much given that it is not a fast path.

    Replace static_branch_likely() with a bool for the condition and a memory barrier for ptr_key.

    Suggested-by: Petr Mladek
    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lkml.kernel.org/r/20220729154716.429964-2-bigeasy@linutronix.de

commit 35cc1d213d79ae8860e7d4f3dea964384810a26e
Author: Sebastian Andrzej Siewior
Date:   Mon May 9 16:04:08 2022 +0200

    genirq: Provide generic_handle_domain_irq_safe().

    Provide generic_handle_domain_irq_safe() which can be used from any context.
    This is similar to commit 509853f9e1e7b ("genirq: Provide generic_handle_irq_safe()") but this time for the irq-domains interface.

    It has been reported for the amd-pinctrl driver via bugzilla
     https://bugzilla.kernel.org/show_bug.cgi?id=215954

    I looked around and added a few users so it is not just a one-user API :)

    Instead of generic_handle_irq(irq_find_mapping()) one can use generic_handle_domain_irq(). The problem with generic_handle_domain_irq() is that with `threadirqs' it will trigger "WARN_ON_ONCE(!in_hardirq())". That interrupt handler can't be marked non-threaded because it is a shared handler (it is marked as such and I can't tell if the interrupt can really be shared on the system).

    Ignoring the just mentioned warning, on PREEMPT_RT the threaded handler is invoked with enabled interrupts, leading to other problems.

    Do we do this?

    Signed-off-by: Sebastian Andrzej Siewior
    Link: https://lore.kernel.org/r/YnkfWFzvusFFktSt@linutronix.de
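The idea of a handler entry point that is usable "from any context" can be modelled in user space as a wrapper that switches interrupts off around the handler call and restores the previous state afterwards. This is only a sketch of that pattern: the global `irqs_enabled` flag and every function name below are hypothetical stand-ins, and it is an assumption that the real helper follows the same shape.

```c
#include <stdbool.h>

/* Models the CPU interrupt-enable flag (hypothetical, user-space only). */
static bool irqs_enabled = true;

/* Counts handler invocations that observed interrupts switched off. */
static int calls_with_irqs_off;

typedef int (*irq_handler_fn)(unsigned int hwirq);

/* Save the current interrupt state and switch interrupts off. */
static bool model_irq_save(void)
{
    bool was_enabled = irqs_enabled;

    irqs_enabled = false;
    return was_enabled;
}

/* Restore whatever state the caller had before model_irq_save(). */
static void model_irq_restore(bool was_enabled)
{
    irqs_enabled = was_enabled;
}

/* Example handler that records the context it was invoked in. */
static int demo_handler(unsigned int hwirq)
{
    (void)hwirq;
    if (!irqs_enabled)
        calls_with_irqs_off++;
    return 0;
}

/*
 * "Safe" entry point: the handler always runs with interrupts off, no
 * matter whether the caller is a hardirq-like or a threaded context.
 */
static int handle_irq_safe_model(irq_handler_fn fn, unsigned int hwirq)
{
    bool was_enabled = model_irq_save();
    int ret = fn(hwirq);

    model_irq_restore(was_enabled);
    return ret;
}
```

A caller running with interrupts enabled, such as a threaded handler in this model, gets them back in the same state after the call, while the handler itself never observes them enabled.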