commit 5571b31c393ae781ab56d07df3488a427230e0da Author: Alexandre Frade Date: Tue Jun 3 04:26:06 2025 +0000 Linux 6.15.0-xanmod1 Signed-off-by: Alexandre Frade commit 30baa1b4955dc66847d75f9854db0f5379603711 Author: Alexandre Frade Date: Thu Jan 23 23:30:58 2025 +0000 XANMOD: .gitlab-ci: Add gitlab-ci.yml file Signed-off-by: Alexandre Frade commit cab6a9176d346f913b60d69cc7baec364c1839c0 Author: Paolo Pisati Date: Thu Feb 6 15:40:09 2025 +0100 binder: turn into module - lock_vma_under_rcu() Signed-off-by: Paolo Pisati Signed-off-by: Alexandre Frade commit cd99e3f84d8921d276bc2a3195eaa130bb1ac949 Author: Paolo Pisati Date: Thu Feb 6 15:38:05 2025 +0100 binder: turn into module - list_lru_add()/list_lru_del() Signed-off-by: Paolo Pisati Signed-off-by: Alexandre Frade commit 684b5a6f0dca865fe04fd43b78c5f7cd104d6c38 Author: Paolo Pisati Date: Tue Jan 23 16:33:16 2024 +0100 file: export file_close_fd() instead of close_fd_get_file() Following the refactor done in: commit a88c955fcfb49727d0ed86b47410f6555a8e69e4 Author: Christian Brauner Date: Thu Nov 30 13:49:07 2023 +0100 file: s/close_fd_get_file()/file_close_fd()/g update the export directive we added in: commit 17f7fa4baad268cc4a93205747222be931699788 Author: Christian Brauner Date: Wed Jan 16 23:13:25 2019 +0100 UBUNTU: SAUCE: binder: turn into module Anbox probably needs a review too. Signed-off-by: Paolo Pisati Signed-off-by: Alexandre Frade commit 4d482e4f54f8d5274d7a71f04e01045e10b9b815 Author: Andrea Righi Date: Fri Dec 17 11:14:24 2021 +0100 wait: allow to use __wake_up_pollfree() from GPL modules commit ebafbcf7f32d ("UBUNTU: SAUCE: binder: turn into module") is changing binder to be a module, but __wake_up_pollfree() can only be used internally by the kernel. Make __wake_up_pollfree an EXPORT_SYMBOL_GPL so that it can be used by the binder module. 
Signed-off-by: Andrea Righi Signed-off-by: Alexandre Frade commit a57b29708d518d4d287feb31cee40dc35ddf11be Author: Christian Brauner Date: Wed Jan 23 21:54:23 2019 +0100 binder: give binder_alloc its own debug mask file Currently binder.c and binder_alloc.c both register the /sys/module/binder_linux/parameters/debug_mask file, which leads to conflicts in sysfs. This commit gives binder_alloc.c its own /sys/module/binder_linux/parameters/alloc_debug_mask file. Signed-off-by: Christian Brauner Signed-off-by: Seth Forshee Signed-off-by: Alexandre Frade commit 80e5366e077a6ee8ca523fe38128a88409e585db Author: Christian Brauner Date: Wed Jan 16 23:13:25 2019 +0100 binder: turn into module The Android binder driver needs to become a module for the sake of shipping Anbox. To do this we need to export the following functions since binder is currently still using them: - security_binder_set_context_mgr() - security_binder_transaction() - security_binder_transfer_binder() - security_binder_transfer_file() - can_nice() - __close_fd_get_file() - mmput_async() - task_work_add() - map_kernel_range_noflush() - get_vm_area() - zap_page_range_single() - put_ipc_ns() - get_ipc_ns_exported() - show_init_ipc_ns() Signed-off-by: Christian Brauner [ saf: fix additional reference to init_ipc_ns from 5.0-rc6 ] Signed-off-by: Seth Forshee [ arighi: fix EXPORT_SYMBOL vs EXPORT_SYMBOL_GPL change from 6.0-rc5 ] [ arighi: zap_page_range() has been dropped, export zap_page_range_single() in 6.3 ] Signed-off-by: Andrea Righi Signed-off-by: Alexandre Frade commit 3d51435b0255331ad337172efc339ce2aa441426 Author: Serge Hallyn Date: Fri May 31 19:12:12 2013 +0100 sysctl: add sysctl to disallow unprivileged CLONE_NEWUSER by default add sysctl to disallow unprivileged CLONE_NEWUSER by default This is a short-term patch. Unprivileged use of CLONE_NEWUSER is certainly an intended feature of user namespaces.
However for at least saucy we want to make sure that, if any security issues are found, we have a fail-safe. Signed-off-by: Serge Hallyn [bwh: Remove unneeded binary sysctl bits] [bwh: Keep this sysctl, but change the default to enabled] Signed-off-by: Alexandre Frade commit 12b5da81d6f5ceafbdd329b4705615e1aad9425d Author: Mark Weiman Date: Sun Aug 12 11:36:21 2018 -0400 PCI: Enable overrides for missing ACS capabilities This is an updated version of Alex Williamson's patch from: https://lkml.org/lkml/2013/5/30/513 Original commit message follows: PCIe ACS (Access Control Services) is the PCIe 2.0+ feature that allows us to control whether transactions are allowed to be redirected in various subnodes of a PCIe topology. For instance, if two endpoints are below a root port or downstream switch port, the downstream port may optionally redirect transactions between the devices, bypassing upstream devices. The same can happen internally on multifunction devices. The transaction may never be visible to the upstream devices. One upstream device that we particularly care about is the IOMMU. If a redirection occurs in the topology below the IOMMU, then the IOMMU cannot provide isolation between devices. This is why the PCIe spec encourages topologies to include ACS support. Without it, we have to assume peer-to-peer DMA within a hierarchy can bypass IOMMU isolation. Unfortunately, far too many topologies do not support ACS to make this a steadfast requirement. Even the latest chipsets from Intel are only sporadically supporting ACS. We have trouble getting interconnect vendors to include the PCIe spec required PCIe capability, let alone suggested features. Therefore, we need to add some flexibility. The pcie_acs_override= boot option lets users opt-in specific devices or sets of devices to assume ACS support. The "downstream" option assumes full ACS support on root ports and downstream switch ports.
The "multifunction" option assumes the subset of ACS features available on multifunction endpoints and upstream switch ports are supported. The "id:nnnn:nnnn" option enables ACS support on devices matching the provided vendor and device IDs, allowing more strategic ACS overrides. These options may be combined in any order. A maximum of 16 id specific overrides are available. It's suggested to use the most limited set of options necessary to avoid completely disabling ACS across the topology. Note to hardware vendors, we have facilities to permanently quirk specific devices which enforce isolation but not provide an ACS capability. Please contact me to have your devices added and save your customers the hassle of this boot option. Rebased-by: Alexandre Frade Signed-off-by: Mark Weiman Signed-off-by: Alexandre Frade commit 87529fccee25c9a34fa12a520bc27e05f405fb6e Author: Andrey Smirnov Date: Sun Sep 24 15:02:33 2023 -0700 mfd: steamdeck: Expose controller board power in sysfs As of version 118 Deck's BIOS implements "SCBP" method that allows gating power of the controller board (VBUS). Add a basic WO method to our root MFD device to allow toggling that. Signed-off-by: Andrey Smirnov (cherry picked from commit f97f32718acc10cbb51fef925842392e80904d74) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit 320b47642fcb44eceb16ce0173c141054bccdd29 Author: Andrey Smirnov Date: Sat Feb 19 16:08:36 2022 -0800 mfd: Add MFD core driver for Steam Deck Add MFD core driver for Steam Deck. Doesn't really do much so far besides instantiating a number of MFD cells that implement all the interesting functionality. 
(cherry picked from commit 5f534c2d6ebdefccb9c024eb0f013bc1c0c622d9) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit 2d19daf86be46c9895cb410283d468524051ab1b Author: Andrey Smirnov Date: Sun Feb 27 12:58:05 2022 -0800 leds: steamdeck: Add support for Steam Deck LED (cherry picked from commit 85a86d19aa7022ff0555023d53aef78323a42d0c) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit 63eec42540f62c3deba122135552610378613374 Author: Andrey Smirnov Date: Sat Jul 15 12:58:54 2023 -0700 hwmon: steamdeck-hwmon: Add support for max battery level/rate Add support for max battery level/charge rate attributes. Signed-off-by: Andrey Smirnov (cherry picked from commit 50af83e8fd75dc52221edd3fb6fd7a7f70c4d8a4) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit 9038a9c2be753aa7bc449f974f3855900e826cfe Author: Andrey Smirnov Date: Sat Feb 19 16:09:45 2022 -0800 hwmon: Add driver for Steam Deck's EC sensors Add driver for sensors exposed by EC firmware on Steam Deck hardware. 
(cherry picked from commit 6917aac77bee6185ae3920b936cdbe7876118c0b) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit 69345b0be46558ffb1d14748cf0e60d43c196319 Author: Andrey Smirnov Date: Sun Feb 27 14:46:08 2022 -0800 extcon: Add driver for Steam Deck (cherry picked from commit f9f2eddae582ae39d5f89c1218448fc259b90aa8) Signed-off-by: Cristian Ciocaltea Signed-off-by: Alexandre Frade commit fbc448279feeb8d29c2238688ffd5a8ebfc98492 Author: Felix Fietkau Date: Tue Feb 20 15:56:02 2018 +0100 netfilter: add xt_FLOWOFFLOAD target Signed-off-by: Felix Fietkau Signed-off-by: Alexandre Frade commit 2e8031e6ee6d37bbc9211e3782064be4566cd465 Author: Alexandre Frade Date: Mon Feb 27 01:38:18 2023 +0000 netfilter: Add netfilter nf_tables fullcone support Signed-off-by: Syrone Wong Signed-off-by: Alexandre Frade commit d157ebce2d10985625d9e351f5306e58222698a1 Author: mfreemon@cloudflare.com Date: Tue Mar 1 17:06:02 2022 -0600 tcp: Add a sysctl to skip tcp collapse processing when the receive buffer is full For context and additional information about this patch, see the blog post at https://blog.cloudflare.com/optimizing-tcp-for-high-throughput-and-low-latency/ sysctl: net.ipv4.tcp_collapse_max_bytes If tcp_collapse_max_bytes is non-zero, attempt to collapse the queue to free up memory if the current amount of memory allocated is less than tcp_collapse_max_bytes. Otherwise, the packet is dropped without attempting to collapse the queue. If tcp_collapse_max_bytes is zero, this feature is disabled and the default Linux behavior is used. The default Linux behavior is to always perform the attempt to collapse the queue to free up memory. When the receive queue is small, we want to collapse the queue. There are two reasons for this: (a) the latency of performing the collapse will be small on a small queue, and (b) we want to avoid sending a congestion signal (via a packet drop) to the sender when the receive queue is small. 
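The sysctl semantics described so far can be sketched in a few lines of Python. This is only an illustration of the decision rule as the commit message states it, not the kernel implementation; the function name and the `rmem_alloc` parameter are hypothetical.

```python
def should_attempt_collapse(rmem_alloc: int, tcp_collapse_max_bytes: int) -> bool:
    """Return True if the receive queue should be collapsed before dropping.

    rmem_alloc: bytes currently allocated to the receive queue (hypothetical name).
    tcp_collapse_max_bytes: value of net.ipv4.tcp_collapse_max_bytes.
    """
    if tcp_collapse_max_bytes == 0:
        # Feature disabled: default Linux behavior, always try to collapse.
        return True
    # Feature enabled: collapse only while the queue is still small.
    return rmem_alloc < tcp_collapse_max_bytes
```

With a limit of 6 MiB, for example, a queue holding 1 MiB would still be collapsed, while a queue holding 8 MiB would lead to the packet being dropped without a collapse attempt.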
The result is that we avoid latency spikes caused by the time it takes to perform the collapse logic when the receive queue is large and full, while preserving existing behavior and performance for all other cases. Signed-off-by: Alexandre Frade commit ec17b73b48652f9123fc47681546c4433ab7a80a Author: Neal Cardwell Date: Sun Jul 23 23:25:34 2023 -0400 tcp: export TCPI_OPT_ECN_LOW in tcp_info tcpi_options field Analogous to other important ECN information, export TCPI_OPT_ECN_LOW in tcp_info tcpi_options field. Signed-off-by: Neal Cardwell Change-Id: I08d8d8c7e8780e6e37df54038ee50301ac5a0320 Signed-off-by: Alexandre Frade commit 70a8a26e6a2826b2db5a4edf4fcc895da1238573 Author: Adithya Abraham Philip Date: Fri Jun 11 21:56:10 2021 +0000 net-tcp_bbr: v3: ensure ECN-enabled BBR flows set ECT on retransmits Adds a new flag TCP_ECN_ECT_PERMANENT that is used by CCAs to indicate that retransmitted packets and pure ACKs must have the ECT bit set. This is necessary for BBR, which when using ECN expects ECT to be set even on retransmitted packets and ACKs. Previous to this addition of TCP_ECN_ECT_PERMANENT, CCAs which can use ECN but don't "need" it did not have a way to indicate that ECT should be set on retransmissions/ACKs. Signed-off-by: Adithya Abraham Philip Signed-off-by: Neal Cardwell Change-Id: I8b048eaab35e136fe6501ef6cd89fd9faa15e6d2 Signed-off-by: Alexandre Frade commit 6902346b40bdc6e997969c2dfcca5f7e20f58c6b Author: Neal Cardwell Date: Tue Jun 11 12:54:22 2019 -0400 net-tcp_bbr: v3: update TCP "bbr" congestion control module to BBRv3 BBR v3 is an enhancement to the BBR v1 algorithm. It's designed to aim for lower queues, lower loss, and better Reno/CUBIC coexistence than BBR v1. BBR v3 maintains the core of BBR v1: an explicit model of the network path that is two-dimensional, adapting to estimate the (a) maximum available bandwidth and (b) maximum safe volume of data a flow can keep in-flight in the network.
It maintains the estimated BDP as a core guide for estimating an appropriate level of in-flight data. BBR v3 makes several key enhancements: o Its bandwidth-probing time scale is adapted, within bounds, to allow improved coexistence with Reno and CUBIC. The bandwidth-probing time scale is (a) extended dynamically based on estimated BDP to improve coexistence with Reno/CUBIC; (b) bounded by an interactive wall-clock time-scale to be more scalable and responsive than Reno and CUBIC. o Rather than being largely agnostic to loss and ECN marks, it explicitly uses loss and (DCTCP-style) ECN signals to maintain its model. o It aims for lower losses than v1 by adjusting its model to attempt to stay within loss rate and ECN mark rate bounds (loss_thresh and ecn_thresh, respectively). o It adapts to loss/ECN signals even when the application is running out of data ("application-limited"), in case the "application-limited" flow is also "network-limited" (the bw and/or inflight available to this flow is lower than previously estimated when the flow ran out of data). o It has a three-part model: the model explicitly tracks three operating points, where an operating point is a tuple: (bandwidth, inflight).
The three operating points are: o latest: the latest measurement from the current round trip o upper bound: robust, optimistic, long-term upper bound o lower bound: robust, conservative, short-term lower bound These are stored in the following state variables: o latest: bw_latest, inflight_latest o lo: bw_lo, inflight_lo o hi: bw_hi[2], inflight_hi To gain intuition about the meaning of the three operating points, it may help to consider the analogs in CUBIC, which has a somewhat analogous three-part model used by its probing state machine: BBR param CUBIC param ----------- ------------- latest ~ cwnd lo ~ ssthresh hi ~ last_max_cwnd The analogy is only a loose one, though, since the BBR operating points are calculated differently, and are 2-dimensional (bw,inflight) rather than CUBIC's one-dimensional notion of operating point (inflight). o It uses the three-part model to adapt the magnitude of its bandwidth to match the estimated space available in the buffer, rather than (as in BBR v1) assuming that it was always acceptable to place 0.25*BDP in the bottleneck buffer when probing (commodity datacenter switches commonly do not have that much buffer for WAN flows). When BBR v3 estimates it hit a buffer limit during probing, its bandwidth probing then starts gently in case little space is still available in the buffer, and then accelerates, slowly at first and then rapidly if it can grow inflight without seeing congestion signals. In such cases, probing is bounded by inflight_hi + inflight_probe, where inflight_probe grows as: [0, 1, 2, 4, 8, 16,...]. This allows BBR to keep losses low and bounded if a bottleneck remains congested, while rapidly/scalably utilizing free bandwidth when it becomes available. o It has a slightly revised state machine, to achieve the goals above.
BBR_BW_PROBE_UP: pushes up inflight to probe for bw/vol BBR_BW_PROBE_DOWN: drain excess inflight from the queue BBR_BW_PROBE_CRUISE: use pipe, w/ headroom in queue/pipe BBR_BW_PROBE_REFILL: try refill the pipe again to 100%, leaving queue empty o The estimated BDP: BBR v3 continues to maintain an estimate of the path's two-way propagation delay, by tracking a windowed min_rtt, and coordinating (on an as-needed basis) to try to expose the two-way propagation delay by draining the bottleneck queue. BBR v3 continues to use its min_rtt and (currently-applicable) bandwidth estimate to estimate the current bandwidth-delay product. The estimated BDP still provides one important guideline for bounding inflight data. However, because any min-filtered RTT and max-filtered bw inherently tend to both overestimate, the estimated BDP is often too high; in this case loss or ECN marks can ensue, in which case BBR v3 adjusts inflight_hi and inflight_lo to adapt its sending rate and inflight down to match the available capacity of the path. o Space: Note that ICSK_CA_PRIV_SIZE increased. This is because BBR v3 requires more space. Note that much of the space is due to support for per-socket parameterization and debugging in this release for research and debugging. With that state removed, the full "struct bbr" is 140 bytes, or 144 with padding. This is an increase of 40 bytes over the existing ca_priv space. o Code: BBR v3 reuses many pieces from BBR v1. But it omits the following significant pieces: o "packet conservation" (bbr_set_cwnd_to_recover_or_restore(), bbr_can_grow_inflight()) o long-term bandwidth estimator ("policer mode") The code layout tries to keep BBR v3 code near the bottom of the file, so that v1-applicable code in the top does not accidentally refer to v3 code.
o Docs: See the following docs for more details and diagrams describing the BBR v3 algorithm: https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00 https://datatracker.ietf.org/meeting/102/materials/slides-102-iccrg-an-update-on-bbr-work-at-google-00 o Internal notes: For this upstream rebase, Neal started from: git show fed518041ac6:net/ipv4/tcp_bbr.c > net/ipv4/tcp_bbr.c then removed dev instrumentation (dynamic get/set for parameters) and code that was only used by BBRv1 Effort: net-tcp_bbr Origin-9xx-SHA1: 2c84098e60bed6d67dde23cd7538c51dee273102 Change-Id: I125cf26ba2a7a686f2fa5e87f4c2afceb65f7a05 Signed-off-by: Alexandre Frade commit afa4090dad7e7b5cb2b3713c8bb401e70f260cb6 Author: David Morley Date: Fri Jul 14 11:07:56 2023 -0400 tcp: introduce per-route feature RTAX_FEATURE_ECN_LOW Define and implement a new per-route feature, RTAX_FEATURE_ECN_LOW. This feature indicates that the given destination network is a low-latency ECN environment, meaning both that ECN CE marks are applied by the network using a low-latency marking threshold and also that TCP endpoints provide precise per-data-segment ECN feedback in ACKs (where the ACK ECE flag echoes the received CE status of all newly-acknowledged data segments). This feature indication can be used by congestion control algorithms to decide how to interpret ECN signals over the given destination network. This feature is appropriate for datacenter-style ECN marking, such as the ECN marking approach expected by DCTCP or BBR congestion control modules. Signed-off-by: David Morley Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Tested-by: David Morley Change-Id: I6bc06e9c6cb426fbae7243fc71c9a8c18175f5d3 Signed-off-by: Alexandre Frade commit a218f2f7267c665c4dbc471772fe2ac17a725d31 Author: Neal Cardwell Date: Mon Sep 21 14:46:26 2020 -0400 net-tcp_bbr: v2: introduce is_acking_tlp_retrans_seq into rate_sample Introduce is_acking_tlp_retrans_seq into rate_sample.
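Two quantities from the BBR v3 description, the estimated BDP (bandwidth times min_rtt) and the probing bound inflight_hi + inflight_probe, can be illustrated with a small Python sketch. It only mirrors the arithmetic stated in the commit message; the function names and the choice of units (packets per second, microseconds) are assumptions made for the example and do not come from tcp_bbr.c.

```python
def estimated_bdp_packets(bw_pkts_per_sec: int, min_rtt_us: int) -> int:
    """Bandwidth-delay product in packets: bandwidth estimate times min_rtt."""
    return bw_pkts_per_sec * min_rtt_us // 1_000_000


def inflight_probe_sequence(rounds: int) -> list[int]:
    """Growth of inflight_probe as given in the text: 0, 1, 2, 4, 8, 16, ..."""
    steps = []
    probe = 0
    for _ in range(rounds):
        steps.append(probe)
        probe = 1 if probe == 0 else probe * 2
    return steps


def probing_bounds(inflight_hi: int, rounds: int) -> list[int]:
    """Per-round upper bound on inflight while probing: inflight_hi + inflight_probe."""
    return [inflight_hi + p for p in inflight_probe_sequence(rounds)]
```

For a path estimated at 10,000 packets per second with a 20 ms min_rtt, `estimated_bdp_packets(10_000, 20_000)` gives a BDP of 200 packets, and `probing_bounds(200, 6)` shows the bound rising gently at first and then doubling.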
This bool will export to the CC module the knowledge of whether the current ACK matched a TLP retransmit. Note that when this bool is true, we cannot yet tell (in general) whether this ACK is for the original or the TLP retransmit. Effort: net-tcp_bbr Change-Id: I2e6494332167e75efcbdc99bd5c119034e9c39b4 Signed-off-by: Alexandre Frade commit 369d67dd695d892dcdebf06a49b476e5e1556f37 Author: Jianfeng Wang Date: Tue Jun 16 17:41:19 2020 +0000 net-tcp_bbr: v2: inform CC module of losses repaired by TLP probe Before this commit, when there is a packet loss that creates a sequence hole that is filled by a TLP loss probe, then tcp_process_tlp_ack() only informs the congestion control (CC) module via a back-to-back entry and exit of CWR. But some congestion control modules (e.g. BBR) do not respond to CWR events. This commit adds a new CA event with which the core TCP stack notifies the CC module when a loss is repaired by a TLP. This will allow CC modules that do not use the CWR mechanism to have a custom handler for such TLP recoveries. Effort: net-tcp_bbr Change-Id: Ieba72332b401b329bff5a641d2b2043a3fb8f632 Signed-off-by: Alexandre Frade commit 9d05a6e5d9de543e016e00d361622dbbd4b34cd2 Author: Jianfeng Wang Date: Fri Jun 19 17:33:45 2020 +0000 net-tcp_bbr: v2: record app-limited status of TLP-repaired flight When sending a TLP retransmit, record whether the outstanding flight of data is application limited. This is important for congestion control modules that want to respond to losses repaired by TLP retransmits. 
This is important because the following scenarios convey very different information: (1) a packet loss with a small number of packets in flight; (2) a packet loss with the maximum amount of data in flight allowed by the CC module; Effort: net-tcp_bbr Change-Id: Ic8ae567caa4e4bfd5fd82c3d4be12a5d9171655e Signed-off-by: Alexandre Frade commit d3b0463dd69081eb55d8fc763e8e81037c0a4159 Author: Neal Cardwell Date: Sat Nov 16 13:16:25 2019 -0500 net-tcp: add fast_ack_mode=1: skip rwin check in tcp_fast_ack_mode__tcp_ack_snd_check() Add logic for an optional TCP connection behavior, enabled with tp->fast_ack_mode = 1, which disables checking the receive window before sending an ack in __tcp_ack_snd_check(). If this behavior is enabled, the data receiver sends an ACK if the amount of data is > RCV.MSS. TCP congestion control modules can enable this bit if they want to generate ACKs quickly. Change-Id: Iaa0a0fd7108221f883137a79d5bfa724f1b096d4 Signed-off-by: Alexandre Frade commit 534420f09ea9cccdf24a3a1b91324e6f2082f719 Author: Neal Cardwell Date: Fri Sep 27 17:10:26 2019 -0400 net-tcp: re-generalize TSO sizing in TCP CC module API Reorganize the API for CC modules so that the CC module once again gets complete control of the TSO sizing decision. This is how the API was set up around 2016 and the initial BBRv1 upstreaming. Later Eric Dumazet simplified it. But with wider testing it now seems that to avoid CPU regressions BBR needs to have a different TSO sizing function. This is necessary to handle cases where there are many flows bottlenecked on the sender host's NIC, in which case BBR's pacing rate is much lower than CUBIC/Reno/DCTCP's. Why does this happen? Because BBR's pacing rate adapts to the low bandwidth share each flow sees. By contrast, CUBIC/Reno/DCTCP see no loss or ECN, so they grow a very large cwnd, and thus large pacing rate and large TSO burst size. 
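The link between pacing rate and TSO burst size can be made concrete with a rough Python sketch. It assumes a heuristic in which a burst is sized to roughly one millisecond of data at the current pacing rate, capped by a maximum burst size and floored at a minimum segment count; that heuristic, the constants, and all names below are assumptions for illustration and are not taken from this commit.

```python
def tso_segments(pacing_rate_bytes_per_sec: int, mss: int,
                 min_segs: int = 2, max_burst_bytes: int = 65536) -> int:
    """Hypothetical TSO sizing: about 1 ms of data at the pacing rate."""
    # Dividing by 1024 approximates "bytes sent in one millisecond".
    burst_bytes = min(pacing_rate_bytes_per_sec // 1024, max_burst_bytes)
    return max(burst_bytes // mss, min_segs)
```

Under these assumptions a flow paced at 10 Gbit/s (1,250,000,000 bytes per second) sends bursts of 45 segments with a 1448-byte MSS, while a flow paced at 10 Mbit/s falls back to the two-segment minimum. Many small bursts cost more CPU per byte, which is the regression the commit message describes.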
Change-Id: Ic8ccfdbe4010ee8d4bf6a6334c48a2fceb2171ea Signed-off-by: Alexandre Frade commit 9a379d988109b58825cdf2f42365c549d0bd7d0c Author: Yousuk Seung Date: Wed May 23 17:55:54 2018 -0700 net-tcp: add new ca opts flag TCP_CONG_WANTS_CE_EVENTS Add a new ca opts flag TCP_CONG_WANTS_CE_EVENTS that allows a congestion control module to receive CE events. Currently congestion control modules have to set the TCP_CONG_NEEDS_ECN bit in opts flag to receive CE events but this may incur changes in ECN behavior elsewhere. This patch adds a new bit TCP_CONG_WANTS_CE_EVENTS that allows congestion control modules to receive CE events independently of TCP_CONG_NEEDS_ECN. Effort: net-tcp Origin-9xx-SHA1: 9f7e14716cde760bc6c67ef8ef7e1ee48501d95b Change-Id: I2255506985242f376d910c6fd37daabaf4744f24 Signed-off-by: Alexandre Frade commit a763cc8750f739c897dbc5aee313dd9475f934b0 Author: Neal Cardwell Date: Wed May 1 20:16:25 2019 -0400 net-tcp_bbr: v2: adjust skb tx.in_flight upon split in tcp_fragment() When we fragment an skb that has already been sent, we need to update the tx.in_flight for the first skb in the resulting pair ("buff"). Because we were not updating the tx.in_flight, the tx.in_flight value was inconsistent with the pcount of the "buff" skb (tx.in_flight would be too high). That meant that if the "buff" skb was lost, then bbr2_inflight_hi_from_lost_skb() would calculate an inflight_hi value that is too high. This could result in longer queues and higher packet loss. Packetdrill testing verified that without this commit, when the second half of an skb is SACKed and then later the first half of that skb is marked lost, the calculated inflight_hi was incorrect.
Effort: net-tcp_bbr Origin-9xx-SHA1: 385f1ddc610798fab2837f9f372857438b25f874 Origin-9xx-SHA1: a0eb099690af net-tcp_bbr: v2: fix tcp_fragment() tx.in_flight recomputation [prod feb 8 2021; use as a fixup] Origin-9xx-SHA1: 885503228153ff0c9114e net-tcp_bbr: v2: introduce tcp_skb_tx_in_flight_is_suspicious() helper for warnings Change-Id: I617f8cab4e9be7a0b8e8d30b047bf8645393354d Signed-off-by: Alexandre Frade commit a7778afe1714f00158baf1928cd6ab702d94e606 Author: Neal Cardwell Date: Wed May 1 20:16:33 2019 -0400 net-tcp_bbr: v2: adjust skb tx.in_flight upon merge in tcp_shifted_skb() When tcp_shifted_skb() updates state as adjacent SACKed skbs are coalesced, previously the tx.in_flight was not adjusted, so we could get contradictory state where the skb's recorded pcount was bigger than the tx.in_flight (the number of segments that were in_flight after sending the skb). Normally having a SACKed skb with contradictory pcount/tx.in_flight would not matter. However, with SACK reneging, the SACKed bit is removed, and an skb once again becomes eligible for retransmitting, fragmenting, SACKing, etc. Packetdrill testing verified the following sequence is possible in a kernel that does not have this commit: - skb N is SACKed - skb N+1 is SACKed and combined with skb N using tcp_shifted_skb() - tcp_shifted_skb() will increase the pcount of prev, but leave tx.in_flight as-is - so prev skb can have pcount > tx.in_flight - RTO, tcp_timeout_mark_lost(), detect reneg, remove "SACKed" bit, mark skb N as lost - find pcount of skb N is greater than its tx.in_flight I suspect this issue is what caused the bbr2_inflight_hi_from_lost_skb(): WARN_ON_ONCE(inflight_prev < 0) to fire in production machines using bbr2.
Effort: net-tcp_bbr Origin-9xx-SHA1: 1a3e997e613d2dcf32b947992882854ebe873715 Change-Id: I1b0b75c27519953430c7db51c6f358f104c7af55 Signed-off-by: Alexandre Frade commit 15ec6b0744ade4ddaa9977f456d2586880338611 Author: Neal Cardwell Date: Tue Aug 7 21:52:06 2018 -0400 net-tcp_bbr: v2: introduce ca_ops->skb_marked_lost() CC module callback API For connections experiencing reordering, RACK can mark packets lost long after we receive the SACKs/ACKs hinting that the packets were actually lost. This means that CC modules cannot easily learn the volume of inflight data at which packet loss happens by looking at the current inflight or even the packets in flight when the most recently SACKed packet was sent. To learn this, CC modules need to know how many packets were in flight at the time lost packets were sent. This new callback, combined with TCP_SKB_CB(skb)->tx.in_flight, allows them to learn this. This also provides a consistent callback that is invoked whether packets are marked lost upon ACK processing, using the RACK reordering timer, or at RTO time. Effort: net-tcp_bbr Origin-9xx-SHA1: afcbebe3374e4632ac6714d39e4dc8a8455956f4 Change-Id: I54826ab53df636be537e5d3c618a46145d12d51a Signed-off-by: Alexandre Frade commit daa677543ca8b6a9c9ec7ad8df36bdcbe7dcc3e1 Author: Neal Cardwell Date: Mon Nov 19 13:48:36 2018 -0500 net-tcp_bbr: v2: export FLAG_ECE in rate_sample.is_ece For understanding the relationship between inflight and ECN signals, to try to find the highest inflight value that has acceptable levels of ECN marking.
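The contradictory state in the sequence above can be reproduced with a toy Python model. It is a sketch only: the `Skb` class and the helper names are hypothetical, and the adjusting merge shows one plausible way to keep the two fields consistent rather than the kernel's actual fix.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    pcount: int        # number of segments carried by this skb
    tx_in_flight: int  # segments in flight right after this skb was sent


def merge_without_adjust(prev: Skb, skb: Skb) -> None:
    """Coalesce skb into prev, leaving tx_in_flight as-is (the old behavior)."""
    prev.pcount += skb.pcount


def merge_with_adjust(prev: Skb, skb: Skb) -> None:
    """Coalesce skb into prev and grow tx_in_flight by the shifted segments."""
    prev.pcount += skb.pcount
    prev.tx_in_flight += skb.pcount  # assumed adjustment, for illustration only


def is_consistent(skb: Skb) -> bool:
    """An skb cannot carry more segments than were in flight after sending it."""
    return skb.pcount <= skb.tx_in_flight


# skb N was sent with 1 segment in flight, skb N+1 with 2 in flight.
old = Skb(pcount=1, tx_in_flight=1)
merge_without_adjust(old, Skb(pcount=1, tx_in_flight=2))

new = Skb(pcount=1, tx_in_flight=1)
merge_with_adjust(new, Skb(pcount=1, tx_in_flight=2))
```

After the unadjusted merge, `old` has pcount 2 but tx_in_flight 1, which is the pcount > tx.in_flight condition the commit message describes; `new` stays consistent.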
Effort: net-tcp_bbr Origin-9xx-SHA1: 3eba998f2898541406c2666781182200934965a8 Change-Id: I3a964e04cee83e11649a54507043d2dfe769a3b3 Signed-off-by: Alexandre Frade commit 808c280b8f1d745e6fe96c90000a9a5e2803b2fb Author: Neal Cardwell Date: Thu Oct 12 23:44:27 2017 -0400 net-tcp_bbr: v2: count packets lost over TCP rate sampling interval For understanding the relationship between inflight and packet loss signals, to try to find the highest inflight value that has acceptable levels of packet losses. Effort: net-tcp_bbr Origin-9xx-SHA1: 4527e26b2bd7756a88b5b9ef1ada3da33dd609ab Change-Id: I594c2500868d9c530770e7ddd68ffc87c57f4fd5 Signed-off-by: Alexandre Frade commit fc97e19fe0fb6f3c1601f815fd35085f19d17dbe Author: Neal Cardwell Date: Sat Aug 5 11:49:50 2017 -0400 net-tcp_bbr: v2: snapshot packets in flight at transmit time and pass in rate_sample CC algorithms may want to snapshot the number of packets in flight at transmit time and pass in rate_sample, to understand the relationship between inflight and losses or ECN signals, to try to find the highest inflight value that has acceptable levels of loss/ECN marking. We split out the code to set an skb's tx.in_flight field into its own function, so that this code can be used for the TCP_REPAIR "fake send" code path that inserts skbs into the rtx queue without sending them. Effort: net-tcp_bbr Origin-9xx-SHA1: b3eb4f2d20efab4ca001f32c9294739036c493ea Origin-9xx-SHA1: e880fc907d06ea7354333f60f712748ebce9497b Origin-9xx-SHA1: 330f825a08a6fe92cef74d799cc468864c479f63 Change-Id: I7314047d0ff14dd261a04b1969a46dc658c8836a Signed-off-by: Alexandre Frade commit ccf1f8b3b8efd3b1a92c766d21877489b3285b4c Author: Neal Cardwell Date: Sun Jun 24 21:55:59 2018 -0400 net-tcp_bbr: v2: shrink delivered_mstamp, first_tx_mstamp to u32 to free up 8 bytes Free up some space for tracking inflight and losses for each bw sample, in upcoming commits. These timestamps are in microseconds, and are now stored in 32 bits. 
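The range of a 32-bit microsecond counter is easy to check with a short Python sketch. The wraparound-safe subtraction at the end is a general technique for fixed-width counters and is included as an assumption for illustration; it is not quoted from this commit, and the names are hypothetical.

```python
USEC_PER_SEC = 1_000_000
U32_RANGE = 2**32

# Exact span of a 32-bit microsecond counter, in seconds.
exact_seconds = U32_RANGE / USEC_PER_SEC

# Power-of-two approximation: treating 10^6 as 2^20 gives 2^32 / 2^20 = 2^12.
approx_seconds = U32_RANGE // 2**20


def elapsed_us(start: int, end: int) -> int:
    """Elapsed microseconds between two u32 timestamps, tolerating one wrap."""
    return (end - start) & 0xFFFFFFFF
```

Here `exact_seconds` is about 4294.97 (a little under 72 minutes) and `approx_seconds` is 4096, so the two figures agree to within roughly 5%.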
So they can only hold time intervals up to roughly 2^12 = 4096 seconds. But Linux TCP RTT and RTO tracking has the same 32-bit microsecond implementation approach and resulting deployment limitations. So this is not introducing a new limit. And these should not be a limitation for the foreseeable future. Effort: net-tcp_bbr Origin-9xx-SHA1: 238a7e6b5d51625fef1ce7769826a7b21b02ae55 Change-Id: I3b779603797263b52a61ad57c565eb91fe42680c Signed-off-by: Alexandre Frade commit 8443bae6a9b2ccdd0c7e1b81ae000bda98bb1b7f Author: Neal Cardwell Date: Tue Jun 11 12:26:55 2019 -0400 net-tcp_bbr: broaden app-limited rate sample detection This commit is a bug fix for the Linux TCP app-limited (application-limited) logic that is used for collecting rate (bandwidth) samples. Previously the app-limited logic only looked for "bubbles" of silence in between application writes, by checking at the start of each sendmsg. But "bubbles" of silence can also happen before retransmits: e.g. bubbles can happen between an application write and a retransmit, or between two retransmits. Retransmits are triggered by ACKs or timers. So this commit checks for bubbles of app-limited silence upon ACKs or timers. Why does this commit check for app-limited state at the start of ACKs and timer handling? Because at that point we know whether inflight was fully using the cwnd. During processing the ACK or timer event we often change the cwnd; after changing the cwnd we can't know whether inflight was fully using the old cwnd. Origin-9xx-SHA1: 3fe9b53291e018407780fb8c356adb5666722cbc Change-Id: I37221506f5166877c2b110753d39bb0757985e68 Signed-off-by: Alexandre Frade commit b9f553c0adf914bb86119b94ef7d7a3e03e242ec Author: Steven Barrett Date: Sat May 21 15:15:09 2022 -0500 ZEN: dm-crypt: Disable workqueues for crypto ops Queueing in dm-crypt for crypto operations reduces performance on modern systems. 
As discussed in an article from Cloudflare, they discovered that queuing was introduced because the crypto subsystem used to be synchronous. Since it's now asynchronous, we get double queueing when using the subsystem through dm-crypt. This is obviously undesirable and reduces throughput and increases latency. Fixes: https://github.com/zen-kernel/zen-kernel/issues/282 Signed-off-by: Alexandre Frade commit 9f2ff312b6c42dcc5186e411356a478476988fb5 Author: Kenny Levinsen Date: Sun Dec 27 14:43:13 2020 +0000 ZEN: input/evdev: Use call_rcu when detaching client Significant time was spent on synchronize_rcu in evdev_detach_client when applications closed evdev devices. Switching VT away from a graphical environment commonly leads to mass input device closures, which could lead to noticeable delays on systems with many input devices. Replace synchronize_rcu with call_rcu, deferring reclaim of the evdev client struct till after the RCU grace period instead of blocking the calling application. While this does not solve all slow evdev fd closures, it takes care of a good portion of them, including this simple test: #include <fcntl.h> #include <unistd.h> int main(int argc, char *argv[]) { int idx, fd; const char *path = "/dev/input/event0"; for (idx = 0; idx < 1000; idx++) { if ((fd = open(path, O_RDWR)) == -1) { return -1; } close(fd); } return 0; } Time to completion of above test when run locally: Before: 0m27.111s After: 0m0.018s Signed-off-by: Kenny Levinsen Signed-off-by: Alexandre Frade commit df7899f83f4df180e20f453143236b6bd4c17071 Author: Arjan van de Ven Date: Thu Jun 2 23:36:32 2016 -0500 drivers: initialize ata before graphics ATA init is the long pole in the boot process, and it's asynchronous.
move the graphics init after it so that ata and graphics initialize in parallel Signed-off-by: Alexandre Frade commit 4f578b8569ddc367789dd392068a24d78eab00ee Author: Arjan van de Ven Date: Sun Feb 18 23:35:41 2018 +0000 locking: rwsem: spin faster tweak rwsem owner spinning a bit Signed-off-by: Alexandre Frade commit f76935528685408e066722ab3bdb619410a6a56d Author: William Douglas Date: Wed Jun 20 17:23:21 2018 +0000 firmware: Enable stateless firmware loading Prefer the order of specific version before generic and /etc before /lib to enable the user to give specific overrides for generic firmware and distribution firmware. Signed-off-by: Alexandre Frade commit fb9169174cdf22daf6f6f5dfc35a13c295284de9 Author: Arjan van de Ven Date: Thu Dec 13 01:00:49 2018 +0000 sched/wait: Do accept() in LIFO order for cache efficiency Signed-off-by: Alexandre Frade commit 0c56777dadb2edb5909851b1e82d6a51a43a1971 Author: Alexandre Frade Date: Mon Nov 18 22:23:17 2024 +0000 XANMOD: Makefile: Move ARM and x86 instruction set selection to kernel-wide build Signed-off-by: Alexandre Frade commit d96ba934d5083048f9dde583e5248d870a2caeca Author: graysky Date: Mon Sep 16 05:55:58 2024 -0400 x86/kconfig: more ISA levels and uarches FEATURES This patch adds additional tunings via new x86-64 ISA levels and more micro-architecture options to the Linux kernel in three classes. 1. New generic x86-64 ISA levels These are selectable under: Processor type and features ---> x86-64 compiler ISA level • x86-64 A value of (1) is the default • x86-64-v2 A value of (2) brings support for vector instructions up to Streaming SIMD Extensions 4.2 (SSE4.2) and Supplemental Streaming SIMD Extensions 3 (SSSE3), the POPCNT instruction, and CMPXCHG16B. • x86-64-v3 A value of (3) adds vector instructions up to AVX2, MOVBE, and additional bit-manipulation instructions. There is also x86-64-v4 but including this makes little sense as the kernel does not use any of the AVX512 instructions anyway. 
Users of glibc 2.33 and above can see which level is supported by running: /lib/ld-linux-x86-64.so.2 --help | grep supported Or /lib64/ld-linux-x86-64.so.2 --help | grep supported 2. New micro-architectures These are selectable under: Processor type and features ---> Processor family • AMD Improved K8-family • AMD K10-family • AMD Family 10h (Barcelona) • AMD Family 14h (Bobcat) • AMD Family 16h (Jaguar) • AMD Family 15h (Bulldozer) • AMD Family 15h (Piledriver) • AMD Family 15h (Steamroller) • AMD Family 15h (Excavator) • AMD Family 17h (Zen) • AMD Family 17h (Zen 2) • AMD Family 19h (Zen 3)** • AMD Family 19h (Zen 4)‡ • AMD Family 1Ah (Zen 5)§ • Intel Silvermont low-power processors • Intel Goldmont low-power processors (Apollo Lake and Denverton) • Intel Goldmont Plus low-power processors (Gemini Lake) • Intel 1st Gen Core i3/i5/i7 (Nehalem) • Intel 1.5 Gen Core i3/i5/i7 (Westmere) • Intel 2nd Gen Core i3/i5/i7 (Sandybridge) • Intel 3rd Gen Core i3/i5/i7 (Ivybridge) • Intel 4th Gen Core i3/i5/i7 (Haswell) • Intel 5th Gen Core i3/i5/i7 (Broadwell) • Intel 6th Gen Core i3/i5/i7 (Skylake) • Intel 6th Gen Core i7/i9 (Skylake X) • Intel 8th Gen Core i3/i5/i7 (Cannon Lake) • Intel 10th Gen Core i7/i9 (Ice Lake) • Intel Xeon (Cascade Lake) • Intel Xeon (Cooper Lake)* • Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake)* • Intel 4th Gen 10nm++ Xeon (Sapphire Rapids)† • Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake)† • Intel 12th Gen i3/i5/i7/i9-family (Alder Lake)† • Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake)‡ • Intel 14th Gen i3/i5/i7/i9-family (Meteor Lake)‡ • Intel 5th Gen 10nm++ Xeon (Emerald Rapids)‡ Notes: If not otherwise noted, gcc >=9.1 is required for support. *Requires gcc >=10.1 or clang >=10.0 **Requires gcc >=10.3 or clang >=12.0 †Requires gcc >=11.1 or clang >=12.0 ‡Requires gcc >=13.0 or clang >=15.0.5 §Requires gcc >14.0 or clang >=19.0? 3.
Auto-detected micro-architecture levels Compile by passing the '-march=native' option, which "selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine. Using -march=native enables all instruction subsets supported by the local machine and will produce code optimized for the local machine under the constraints of the selected instruction set."[1] Users of Intel CPUs should select the 'Intel-Native' option and users of AMD CPUs should select the 'AMD-Native' option. MINOR NOTES RELATING TO INTEL ATOM PROCESSORS This patch also changes -march=atom to -march=bonnell in accordance with the gcc v4.9 changes. Upstream is using the deprecated -march=atom flag when I believe it should use the newer -march=bonnell flag for Atom processors.[2] It is not recommended to compile on Atom-CPUs with the 'native' option.[3] The recommendation is to use the 'atom' option instead. BENEFITS Small but real speed increases are measurable using a make endpoint comparing a generic kernel to one built with one of the respective microarchs. See the following experimental evidence supporting this statement: https://github.com/graysky2/kernel_compiler_patch?tab=readme-ov-file#benchmarks REQUIREMENTS linux version 6.1.79+ gcc version >=9.0 or clang version >=9.0 ACKNOWLEDGMENTS This patch builds on the seminal work by Jeroen.[4] REFERENCES 1. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-x86-Options 2. https://bugzilla.kernel.org/show_bug.cgi?id=77461 3. https://github.com/graysky2/kernel_gcc_patch/issues/15 4.
http://www.linuxforge.net/docs/linux/linux-gcc.php Signed-off-by: Alexandre Frade commit 4436de2c1bbd587ef81beb91d062e4c154b0708f Author: Alexandre Frade Date: Mon Jun 2 19:11:59 2025 +0000 x86/cpu: Re-add configuration options for early 64-bit CPUs This reintroduces commit f388f60ca9041a95c9b3f157d316ed7c8f297e44 Signed-off-by: Alexandre Frade commit 56a19ed77558c3978443531c61e8f63a9a9f7ff3 Author: Alexandre Frade Date: Mon Nov 18 20:17:44 2024 +0000 XANMOD: x86/build: Prevent generating avx2 floating-point code Signed-off-by: Alexandre Frade commit 1d9cedcee2e4475ba4ee0cfc508381097fcf3ffa Author: Alexandre Frade Date: Mon Apr 24 04:50:34 2023 +0000 XANMOD: scripts/setlocalversion: Move localversion* files to the end Signed-off-by: Alexandre Frade commit 870ee888b2322440edf60de7f2529981497b0f3c Author: Alexandre Frade Date: Sun May 29 00:57:40 2022 +0000 XANMOD: scripts/setlocalversion: remove "+" tag for git repo short version Signed-off-by: Alexandre Frade commit 50e8a4480d7ffbcf785f5bbf3ac246cf5d6f88fa Author: Alexandre Frade Date: Mon Sep 16 08:09:56 2024 +0000 XANMOD: lib/kconfig.debug: disable default SYMBOLIC_ERRNAME and DEBUG_BUGVERBOSE Signed-off-by: Alexandre Frade commit 815f35565d14e96ccaba1d51790732f26abc7df7 Author: Alexandre Frade Date: Tue Mar 31 13:32:08 2020 -0300 XANMOD: cpufreq: tunes ondemand and conservative governor for performance Signed-off-by: Alexandre Frade commit 9834c713e9db04d2dda246298944ec6e9b37fd16 Author: Alexandre Frade Date: Wed Jun 15 17:07:29 2022 +0000 XANMOD: sched/autogroup: Add kernel parameter and config option to enable/disable autogroup feature by default Signed-off-by: Alexandre Frade commit 0be3827f4e483160811b889a7879c372cc3c3457 Author: Alexandre Frade Date: Fri May 30 19:58:58 2025 +0000 XANMOD: mm/vmscan: Reduce amount of swapping Signed-off-by: Alexandre Frade commit 31245605b572331eefefb133857733e7dfd3ddaa Author: Alexandre Frade Date: Sun Apr 28 09:06:54 2024 +0000 XANMOD: mm: Raise max_map_count default value 
Signed-off-by: Alexandre Frade commit 4dd984f0e9520d18fc2af71caba0db8cc0a79b08 Author: Alexandre Frade Date: Mon Jan 29 16:59:22 2018 +0000 XANMOD: dcache: cache_pressure = 50 decreases the rate at which VFS caches are reclaimed Signed-off-by: Alexandre Frade commit a6a62caee939bdc41a370a20e0d7fa6d926a2067 Author: Alexandre Frade Date: Mon Jan 29 17:26:15 2018 +0000 XANMOD: kconfig: add 500Hz timer interrupt kernel config option Signed-off-by: Alexandre Frade commit 2efc2aa5e7baa2a8568954de094ee51273a4ddeb Author: Alexandre Frade Date: Mon Jul 15 04:50:34 2024 +0000 XANMOD: blk-wbt: Set wbt_default_latency_nsec() to 2msec Signed-off-by: Alexandre Frade commit 5256fceec40a9d1d6430900d1701abb013fc4840 Author: Alexandre Frade Date: Mon Sep 16 15:36:01 2024 +0000 XANMOD: block: Set rq_affinity to force complete I/O requests on same CPU Signed-off-by: Alexandre Frade commit e555e6786e627803e7bd27ccae16029ebdc5e23c Author: Alexandre Frade Date: Thu Jan 6 16:59:01 2022 +0000 XANMOD: block/mq-deadline: Disable front_merges by default Signed-off-by: Alexandre Frade commit be04c513e48c60763a3582d2fc61e9759516f68b Author: Alexandre Frade Date: Wed May 11 18:56:51 2022 +0000 XANMOD: block/mq-deadline: Increase write priority to improve responsiveness Signed-off-by: Alexandre Frade commit e5c96ee84cc54ae354b083d6595ca6cb53f4cee1 Author: Alexandre Frade Date: Sun Sep 15 23:03:38 2024 +0000 XANMOD: sched: Add yield_type sysctl to reduce or disable sched_yield Signed-off-by: Alexandre Frade commit fabc9733326c78d0ca504a68c0403307dea0abcd Author: Alexandre Frade Date: Thu May 11 19:41:41 2023 +0000 XANMOD: fair: Set scheduler tunable latencies to unscaled Signed-off-by: Alexandre Frade commit ffa23504a2832c42ecb1cb0496eea4711dfa7bc9 Author: Alexandre Frade Date: Sat Aug 31 16:57:41 2024 +0000 kbuild: Remove GCC minimal function alignment Signed-off-by: Alexandre Frade commit 8718985a89da4ac886d2393641ba4060059f96a3 Author: Alexandre Frade Date: Thu Nov 28 22:55:27 2024 +0000 
kbuild: Re-add .config file required to sign external modules Signed-off-by: Alexandre Frade commit 962b9b3205ba377b19fca19ab3a329db02659a7e Author: Alexandre Frade Date: Mon Sep 16 00:55:35 2024 +0000 XANMOD: kbuild: Add GCC SMS-based modulo scheduling flags Signed-off-by: Alexandre Frade commit 46853b043481aed5dfb08787a53636e884a9b191 Author: Alexandre Frade Date: Mon Aug 28 05:00:29 2023 +0000 XANMOD: x86/build: Add more CFLAGS optimizations Signed-off-by: Alexandre Frade commit 0ff41df1cb268fc69e703a08a57ee14ae967d0ca Author: Linus Torvalds Date: Sun May 25 16:09:23 2025 -0700 Linux 6.15