commit b60b47dcf7285e47fce3beef47e450cc351c9ec2
Author: Alexandre Frade
Date:   Sat Nov 3 20:03:54 2018 -0300

    4.19.0-xanmod1_rev3

    Signed-off-by: Alexandre Frade

commit 802f6ff0d11ae2e2a1176ca02cd253ea08c13dd1
Author: Alexandre Frade
Date:   Sat Nov 3 19:14:42 2018 -0300

    config: set latency based cgroup io protection

    Signed-off-by: Alexandre Frade

commit 1b89991b093bd53749806e70fd8d94f48988a2fb
Author: Mark Weiman
Date:   Sun Aug 12 11:36:21 2018 -0400

    pci: Enable overrides for missing ACS capabilities

    This is an updated version of Alex Williamson's patch from:
    https://lkml.org/lkml/2013/5/30/513

    Original commit message follows:

    PCIe ACS (Access Control Services) is the PCIe 2.0+ feature that
    allows us to control whether transactions are allowed to be
    redirected in various subnodes of a PCIe topology. For instance,
    if two endpoints are below a root port or downstream switch port,
    the downstream port may optionally redirect transactions between
    the devices, bypassing upstream devices. The same can happen
    internally on multifunction devices. The transaction may never be
    visible to the upstream devices.

    One upstream device that we particularly care about is the IOMMU.
    If a redirection occurs in the topology below the IOMMU, then the
    IOMMU cannot provide isolation between devices. This is why the
    PCIe spec encourages topologies to include ACS support. Without it,
    we have to assume peer-to-peer DMA within a hierarchy can bypass
    IOMMU isolation.

    Unfortunately, far too many topologies do not support ACS to make
    this a steadfast requirement. Even the latest chipsets from Intel
    are only sporadically supporting ACS. We have trouble getting
    interconnect vendors to include the PCIe spec required PCIe
    capability, let alone suggested features.

    Therefore, we need to add some flexibility. The pcie_acs_override=
    boot option lets users opt in specific devices or sets of devices
    to assume ACS support. The "downstream" option assumes full ACS
    support on root ports and downstream switch ports.
    The "multifunction" option assumes the subset of ACS features
    available on multifunction endpoints and upstream switch ports are
    supported. The "id:nnnn:nnnn" option enables ACS support on devices
    matching the provided vendor and device IDs, allowing more
    strategic ACS overrides. These options may be combined in any
    order. A maximum of 16 id specific overrides are available. It's
    suggested to use the most limited set of options necessary to avoid
    completely disabling ACS across the topology.

    Note to hardware vendors, we have facilities to permanently quirk
    specific devices which enforce isolation but do not provide an ACS
    capability. Please contact me to have your devices added and save
    your customers the hassle of this boot option.

    Signed-off-by: Mark Weiman

commit c6196fae8ffe3e49ee6447cfe741ab78114bf141
Author: Alfred Chen
Date:   Mon Oct 29 15:37:04 2018 +0800

    Tag PDS 0.99c

commit 219d033895a7fc78b5dc58ab3bfc99a43764a439
Author: Alfred Chen
Date:   Mon Oct 29 15:32:23 2018 +0800

    pds: Optimize ISO branch in normal_prio().

commit e2b6f42ae7034de41be6a7d4544ef0ee318da3dd
Author: Alfred Chen
Date:   Thu Oct 18 15:42:49 2018 +0000

    pds: Enable SCHED_DEADLINE support.

    Enable SCHED_DEADLINE support by squashing into priority 0
    SCHED_FIFO tasks.

commit 807f2913fa0629b8567100d1dda2d6574f4ddce0
Author: Alfred Chen
Date:   Mon Oct 15 14:08:17 2018 +0000

    pds: Handle SCHED_BATCH as SCHED_IDLE.

commit 41255458b9654007f17335feab259dd2669738af
Author: Alfred Chen
Date:   Thu Oct 11 15:24:39 2018 +0800

    pds: Unified time slice for FIFO tasks.

commit 3a9afa6b8d94e5cca6bf4a9eab39721b7649f371
Author: Alfred Chen
Date:   Sat Sep 29 09:46:51 2018 +0800

    pds: Fix and sync-up reset on fork handling.

commit b277dbc1464b7097e8e4774d8cb6b812025357d4
Author: Alexandre Frade
Date:   Wed Oct 24 22:02:46 2018 -0300

    4.19.0-xanmod1_rev2

    Signed-off-by: Alexandre Frade

commit 6d261a96235bd792e980f75c78ecb0fb82eb15e1
Author: Alexandre Frade
Date:   Wed Oct 24 21:52:31 2018 -0300

    config: set cake qdisc by default

    Signed-off-by: Alexandre Frade

commit d62c0b4460e9baf854cc4d8e018d45486ca22254
Author: Alexandre Frade
Date:   Wed Oct 24 16:58:52 2018 -0300

    net/sched: allow configuring cake qdisc as default

    Signed-off-by: Alexandre Frade

commit 311b62e01d622729692a35c497d621ee27552800
Author: Alexandre Frade
Date:   Mon Oct 22 18:06:51 2018 -0300

    4.19.0-xanmod1

    Signed-off-by: Alexandre Frade

commit 6042c60f996930f1374ac60bba8b0058fe2b9f1b
Author: J. R. Okajima
Date:   Mon Oct 22 17:19:04 2018 -0300

    aufs: add aufs4.x-rcN 20181022

commit 8204082b8ca089d57da73599b3b822f39747eaed
Author: Omar Sandoval
Date:   Fri Sep 28 09:22:50 2018 -0700

    kyber: fix integer overflow of latency targets on 32-bit

    NSEC_PER_SEC has type long, so 5 * NSEC_PER_SEC is calculated as a
    long. However, 5 seconds is 5,000,000,000 nanoseconds, which
    overflows a 32-bit long. Make sure all of the targets are
    calculated as 64-bit values.

    Fixes: 6e25cb01ea20 ("kyber: implement improved heuristics")
    Reported-by: Stephen Rothwell
    Signed-off-by: Omar Sandoval

commit 46a3c20c03a6321ec19f9c66b5345989a32833c3
Author: Omar Sandoval
Date:   Thu Sep 27 15:55:55 2018 -0700

    kyber: add tracepoints

    When debugging Kyber, it's really useful to know what latencies
    we've been having, how the domain depths have been adjusted, and if
    we've actually been throttling. Add three tracepoints,
    kyber_latency, kyber_adjust, and kyber_throttled, to record that.

    Signed-off-by: Omar Sandoval

commit 3874e76acb2b54db327fdbae772ec64498176177
Author: Omar Sandoval
Date:   Thu Sep 27 15:55:54 2018 -0700

    kyber: implement improved heuristics

    Kyber's current heuristics have a few flaws:

    - It's based on the mean latency, but p99 latency tends to be more
      meaningful to anyone who cares about latency. The mean can also
      be skewed by rare outliers that the scheduler can't do anything
      about.
    - The statistics calculations are purely time-based with a short
      window. This works for steady, high load, but is more sensitive
      to outliers with bursty workloads.
    - It only considers the latency once an I/O has been submitted to
      the device, but the user cares about the time spent in the
      kernel, as well.

    These are shortcomings of the generic blk-stat code which doesn't
    quite fit the ideal use case for Kyber. So, this replaces the
    statistics with a histogram used to calculate percentiles of total
    latency and I/O latency, which we then use to adjust depths in a
    slightly more intelligent manner:

    - Sync and async writes are now the same domain.
    - Discards are a separate domain.
    - Domain queue depths are scaled by the ratio of the p99 total
      latency to the target latency (e.g., if the p99 latency is double
      the target latency, we will double the queue depth; if the p99
      latency is half of the target latency, we can halve the queue
      depth).
    - We use the I/O latency to determine whether we should scale queue
      depths down: we will only scale down if any domain's I/O latency
      exceeds the target latency, which is an indicator of congestion
      in the device.

    These new heuristics are just as scalable as the heuristics they
    replace.
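
    The depth-scaling rule described above can be sketched roughly as
    follows. This is an illustrative sketch only, not the code from
    this commit: the function name scale_depth(), its parameters, the
    nanosecond units, and the clamping to [1, max_depth] are all
    assumptions made for the example. The actual change works from
    latency histograms rather than raw latency values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: scale a domain's queue depth by the ratio of its
 * p99 total latency to its target latency.
 *
 * - p99 at or above target: the depth grows by p99 / target.
 * - p99 below target: the depth shrinks by the same ratio, but only
 *   when "congested" is set (some domain's I/O latency exceeded its
 *   target).
 *
 * The result is clamped to [1, max_depth]; the clamp is an assumption
 * of this sketch.
 */
unsigned int scale_depth(unsigned int depth, uint64_t p99_ns,
                         uint64_t target_ns, unsigned int max_depth,
                         bool congested)
{
    uint64_t scaled;

    assert(target_ns > 0 && max_depth > 0);

    /* Latency is under target and the device is not congested: keep it. */
    if (p99_ns < target_ns && !congested)
        return depth;

    /* 64-bit arithmetic so the intermediate product cannot wrap at 32 bits. */
    scaled = (uint64_t)depth * p99_ns / target_ns;

    if (scaled < 1)
        scaled = 1;
    if (scaled > max_depth)
        scaled = max_depth;

    return (unsigned int)scaled;
}
```

    With a depth of 64 and a p99 of twice the target, this returns 128;
    with a p99 of half the target it returns 32 when congestion was
    observed and leaves the depth at 64 otherwise.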

    Signed-off-by: Omar Sandoval

commit 4c9e49fac96f8ebc41da400f97876023db2c760a
Author: Omar Sandoval
Date:   Thu Sep 27 15:55:53 2018 -0700

    kyber: don't make domain token sbitmap larger than necessary

    The domain token sbitmaps are currently initialized to the device
    queue depth or 256, whichever is larger, and immediately resized to
    the maximum depth for that domain (256, 128, or 64 for read, write,
    and other, respectively). The sbitmap is never resized larger than
    that, so it's unnecessary to allocate a bitmap larger than the
    maximum depth. Let's just allocate it to the maximum depth to begin
    with. This will use marginally less memory, and more importantly,
    give us a more appropriate number of bits per sbitmap word.

    Signed-off-by: Omar Sandoval

commit d909d8d713045ab8368db76fa339bccea344531b
Author: Omar Sandoval
Date:   Thu Sep 27 15:55:52 2018 -0700

    block: export blk_stat_enable_accounting()

    Kyber will need this in a future change if it is built as a module.

    Signed-off-by: Omar Sandoval

commit 2e2e528e0354a908b2f7f9d7cc7847d47d38d05e
Author: Omar Sandoval
Date:   Thu Sep 27 15:55:51 2018 -0700

    block: move call of scheduler's ->completed_request() hook

    Commit 4bc6339a583c ("block: move blk_stat_add() to
    __blk_mq_end_request()") consolidated some calls using ktime_get()
    so we'd only need to call it once. Kyber's ->completed_request()
    hook also calls ktime_get(), so let's move it to the same place,
    too.

    Signed-off-by: Omar Sandoval

commit be44d397728eab52b86da05161ac63d5eeeef605
Author: Alexandre Frade
Date:   Mon Jan 29 18:36:35 2018 +0000

    block: set rq_affinity = 2 for full multithreading I/O requests

    Signed-off-by: Alexandre Frade

commit 8c657f24b150598b4c40ea38f8ac043a03d093ba
Author: Alexandre Frade
Date:   Mon Jan 29 18:29:13 2018 +0000

    sched/core: nr_migrate = 128 increases number of tasks to iterate in a single balance run.

    Signed-off-by: Alexandre Frade

commit 1055d06e07e2a23f078c9bdcc4c4662ce5aecf77
Author: Alexandre Frade
Date:   Mon Jan 29 17:55:52 2018 +0000

    cpufreq: tunes ondemand governor for performance

    Signed-off-by: Alexandre Frade

commit b13e401885f95faad0c2d048a5127faf495b86f2
Author: Alexandre Frade
Date:   Mon Jan 29 17:41:29 2018 +0000

    disable the localversion "+" tag of a git repo

    Signed-off-by: Alexandre Frade

commit f69221ba05c9e1fc60af7e1ae9a73468ef3d1b28
Author: Alexandre Frade
Date:   Mon Jan 29 17:36:22 2018 +0000

    mm/zswap: set to use lz4 compressor

    Signed-off-by: Alexandre Frade

commit 65a65dba819ecb59fc570cee7efa7356ea1303a6
Author: Alexandre Frade
Date:   Mon Jan 29 17:31:25 2018 +0000

    mm/vmscan: vm_swappiness = 30 decreases the amount of swapping

    Signed-off-by: Alexandre Frade

commit 3601a7d373b8d7dd567de4aed18d9a789a364d1e
Author: Alexandre Frade
Date:   Mon Jan 29 17:26:15 2018 +0000

    kconfig: add 500Hz timer interrupt kernel config

    Signed-off-by: Alexandre Frade

commit f6dfefa25e21d080ff18f9597e760109f345c209
Author: Alexandre Frade
Date:   Mon Jan 29 17:21:43 2018 +0000

    mm: set 128/2048 (min/max) kilobytes to read-ahead for filesystems on this block device

    Signed-off-by: Alexandre Frade

commit c70ea4e33cb6b92c32b4d14158d44f7ed4908381
Author: Alexandre Frade
Date:   Mon Jan 29 16:59:22 2018 +0000

    dcache: cache_pressure = 50 decreases the rate at which VFS caches are reclaimed

    Signed-off-by: Alexandre Frade

commit 0a7af55fc9a0fe141cb4ca9f881849e610b88fbc
Author: Alexandre Frade
Date:   Thu Jul 6 03:03:36 2017 +0000

    add trace events for open(), exec() and uselib()

    Signed-off-by: Alexandre Frade

commit 204b1e1904cd8a0caeab2f95fd11a08b793f6f84
Author: Alexandre Frade
Date:   Tue Aug 14 15:46:14 2018 -0300

    elevator: set default scheduler to kyber for blk-mq

    Signed-off-by: Alexandre Frade

commit 442973b943ed7697d73d42f3b103d77a844037ea
Author: Alfred Chen
Date:   Mon Oct 22 13:58:51 2018 +0000

    pds: PDS 0.99b for v4.19 kernel

commit 79a6643523cf51189edc4a086b8b3ea34b06e6f2
Author: graysky
Date:   Mon Oct 22 16:58:08 2018 -0300

    x86/Kconfig: Enable additional cpu optimizations for gcc v8.1+ kernel v4.13+

    WARNING
    This patch works with gcc versions 8.1+ and with kernel version
    4.13+ and should NOT be applied when compiling on older versions of
    gcc due to key name changes of the march flags introduced with the
    version 4.9 release of gcc.[1]

    Use the older version of this patch hosted on the same github for
    older versions of gcc.

    FEATURES
    This patch adds additional CPU options to the Linux kernel
    accessible under:
     Processor type and features  --->
      Processor family --->

    The expanded microarchitectures include:
    * AMD Improved K8-family
    * AMD K10-family
    * AMD Family 10h (Barcelona)
    * AMD Family 14h (Bobcat)
    * AMD Family 16h (Jaguar)
    * AMD Family 15h (Bulldozer)
    * AMD Family 15h (Piledriver)
    * AMD Family 15h (Steamroller)
    * AMD Family 15h (Excavator)
    * AMD Family 17h (Zen)
    * Intel Silvermont low-power processors
    * Intel 1st Gen Core i3/i5/i7 (Nehalem)
    * Intel 1.5 Gen Core i3/i5/i7 (Westmere)
    * Intel 2nd Gen Core i3/i5/i7 (Sandybridge)
    * Intel 3rd Gen Core i3/i5/i7 (Ivybridge)
    * Intel 4th Gen Core i3/i5/i7 (Haswell)
    * Intel 5th Gen Core i3/i5/i7 (Broadwell)
    * Intel 6th Gen Core i3/i5/i7 (Skylake)
    * Intel 6th Gen Core i7/i9 (Skylake X)
    * Intel 8th Gen Core i3/i5/i7 (Cannon Lake)
    * Intel 8th Gen Core i7/i9 (Ice Lake)

    It also offers to compile passing the 'native' option which,
    "selects the CPU to generate code for at compilation time by
    determining the processor type of the compiling machine. Using
    -march=native enables all instruction subsets supported by the
    local machine and will produce code optimized for the local machine
    under the constraints of the selected instruction set."[3]

    MINOR NOTES
    This patch also changes 'atom' to 'bonnell' in accordance with the
    gcc v4.9 changes.
    Note that upstream is using the deprecated 'march=atom' flags when
    I believe it should use the newer 'march=bonnell' flag for atom
    processors.[2]

    It is not recommended to compile on Atom-CPUs with the 'native'
    option.[4] The recommendation is to use the 'atom' option instead.

    BENEFITS
    Small but real speed increases are measurable using a make endpoint
    comparing a generic kernel to one built with one of the respective
    microarchs.

    See the following experimental evidence supporting this statement:
    https://github.com/graysky2/kernel_gcc_patch

    REQUIREMENTS
    linux version >=3.15
    gcc version >=8.1

    ACKNOWLEDGMENTS
    This patch builds on the seminal work by Jeroen.[5]

    REFERENCES
    1. https://gcc.gnu.org/gcc-4.9/changes.html
    2. https://bugzilla.kernel.org/show_bug.cgi?id=77461
    3. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
    4. https://github.com/graysky2/kernel_gcc_patch/issues/15
    5. http://www.linuxforge.net/docs/linux/linux-gcc.php

commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
Author: Greg Kroah-Hartman
Date:   Mon Oct 22 07:37:37 2018 +0100

    Linux 4.19