Gentoo Archives: gentoo-user

From: james <garftd@×××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] kde-apps/kde-l10n-16.04.3:5/5::gentoo conflicting with kde-apps/kdepim-l10n-15.12.3:5/5::gentoo
Date: Tue, 09 Aug 2016 21:14:21
Message-Id: 1508580f-5702-66a2-5d89-b0a572595333@verizon.net
In Reply to: Re: [gentoo-user] kde-apps/kde-l10n-16.04.3:5/5::gentoo conflicting with kde-apps/kdepim-l10n-15.12.3:5/5::gentoo by Michael Mol
On 08/09/2016 01:41 PM, Michael Mol wrote:
> On Tuesday, August 09, 2016 01:23:57 PM james wrote:
>> On 08/09/2016 09:17 AM, Michael Mol wrote:
>>> On Tuesday, August 09, 2016 09:13:31 AM james wrote:
>>>> On 08/09/2016 07:42 AM, Michael Mol wrote:
>>>>> On Monday, August 08, 2016 10:45:09 PM Alan McKinnon wrote:
>>>>>> On 08/08/2016 19:20, Michael Mol wrote:
>>>>>>> On Monday, August 08, 2016 06:52:15 PM Alan McKinnon wrote:
>>>>>>>> On 08/08/2016 17:02, Michael Mol wrote:
>
>>> I use Zabbix extensively at work, and have the Zabbix agent on my
>>> workstation reporting back various supported metrics. There's a great
>>> deal you can use (and--my favorite--abuse) Zabbix for, especially once
>>> you understand how it thinks.
>>
>> Congratulations! Of the net-analyzer crowd, you've managed to find one I
>> have not spent time with...
>
> Oh, man, are you in for a treat. I recently had a conversation with a guy I
> happened to sit next to while traveling about how, were I in his position, I'd
> improve his cash crop and hydroponics operations (he periodically tests soil
> and sunlight properties) continually using a combination of cheap, custom
> probes and SBCs, feeding the data into Zabbix for monitoring and trend
> analysis / prediction. Zabbix will do time-series graphing and analysis of
> arbitrary input data; it may have been designed for watching interface
> counters, but there's no reason it need be limited to that...

Not sure of your tendencies, but yeah, I tend to be more hardware- and
EE-oriented than CS. Yep, I spent too many years with time-sequenced data
(turds) not to be totally excited about what we can now do with clusters,
analog (16-bit+) IO, and enough processors and memory to keep a simulation
going, in RT (color). You sure know how to instigate an itch...

Besides, as I transcend retirement, I'm looking for greener pastures
and methodologies to enhance da(tm) dream state...
(thx)

>>>> Any specific kernel tweaks?
>>>
>>> Most of my tweaks for KDE revolved around tuning mysqld itself. But for
>>> sysctls improving workstation responsiveness as it relates to memory
>>> interactions with I/O, these are my go-tos:
>>>
>>> vm.dirty_background_bytes = 1048576
>>> vm.dirty_bytes = 10485760
>>> vm.swappiness = 0
>>
>> Mine are:
>> cat dirty_bytes
>> 0
>> cat dirty_background_bytes
>> 0
>
> So, that means you have vm.dirty_ratio and vm.dirty_background_ratio
> set, instead. I forget what those default to, but I think
> dirty_background_ratio defaults to something like 10, which means *10%* of
> your memory may get used for buffering disk I/O before it starts writing data
> to disk. dirty_ratio will necessarily be higher, which means that if
> you're performing seriously write-intensive activities on a system with 32GiB
> of RAM, you may find yourself with a system that will halt until it finishes
> flushing 3+GiB of data to disk.
>
>> cat swappiness
>> 60
>
> Yeah, you want that set lower than that.
>
>>
>>> vm.dirty_background_bytes ensures that any data (i.e. from mmap or
>>> fwrite, not from swapping) waiting to be written to disk *starts*
>>> getting written to disk once you've got at least the configured amount
>>> (1MB) of data waiting. (If you've got a disk controller with
>>> battery-backed or flash-backed write cache, you might consider
>>> increasing this to some significant fraction of your write cache. I.e.
>>> if you've got a 1GB FBWC with 768MB of that dedicated to write cache,
>>> you might set this to 512MB or so. Depending on your workload. I/O
>>> tuning is for those of us who enjoy the dark arts.)
>>>
>>> vm.dirty_bytes says that once you've got the configured amount (10MB) of
>>> data waiting to be written to disk, then no more asynchronous I/O is
>>> permitted until you have no more data waiting; all outstanding writes
>>> must be finished first. (My rule of thumb is to have this between 2 and
>>> 10 times the value of vm.dirty_background_bytes. Though I'm really trying
>>> to avoid it being high enough that it could take more than 50ms to
>>> transfer to disk; that way, any stalls that do happen are almost
>>> imperceptible.)
>>>
>>> You want vm.dirty_background_bytes to be high enough that your hardware
>>> doesn't spend its time powered on if it doesn't have to be, and so that
>>> your hardware can transfer data in large, efficient, streamable chunks.
>>>
>>> You want vm.dirty_bytes sufficiently higher than your first number that
>>> your hardware has enough time to spin up and transfer data before you
>>> put the hammer down and say, "all right, nobody else gets to queue
>>> writes until all the waiting data has reached disk."
>>>
>>> You want vm.dirty_bytes *low* enough that when you *do* have to put that
>>> hammer down, it doesn't interfere with your perception of a responsive
>>> system. (And in a server context, you want it low enough that things
>>> can't time out--or be pushed into timing out--waiting for it. Think of
>>> your own attention as something that times out while waiting for things
>>> to respond, and the same principle applies...)
>>>
>>> Now, vm.swappiness? That's a weighting factor for how quickly the kernel
>>> should try moving memory to swap to be able to speedily respond to new
>>> allocations. Me, I prefer the kernel to not preemptively move
>>> lesser-used data to swap, because that's going to be a few hundred
>>> megabytes worth of data all associated with one application, and it'll
>>> be a real drag when I switch back to the application I haven't used for
>>> half an hour. So I set vm.swappiness to 0, to tell the kernel to only
>>> move data to swap if it has no other alternative while trying to satisfy
>>> a new memory allocation request.
>>
>> OK, OK, OK. I need to read a bit about these. Any references or docs, or
>> is this the result of parsing out what is least painful for a
>> workstation? I do not run any heavy databases on my workstation; they
>> are only there to hack on. I test db-centric stuff on domain
>> servers, sometimes with limited resources. I run lxde and I'm moving to
>> lxqt for workstations and humanoid (terminal) IO.
>
> https://www.kernel.org/doc/Documentation/sysctl/vm.txt
> https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

Excellent docs, thx.

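If I'm reading them right, making your values stick is just a drop-in sysctl
fragment plus a reload; something like the sketch below is what I plan to try
(the file name is my own invention, and OpenRC should pick it up at boot from
/etc/sysctl.d/):

# /etc/sysctl.d/90-writeback.conf -- values lifted straight from your mail
vm.dirty_background_bytes = 1048576
vm.dirty_bytes = 10485760
vm.swappiness = 0

# apply now, without a reboot; setting the *_bytes knobs should zero the
# corresponding *_ratio knobs automatically
sysctl -p /etc/sysctl.d/90-writeback.conf
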
>> Do you set these differently for servers?
>
> On my servers, I keep these values similar, because I'd rather have a little
> bit lower throughput than risk a catastrophic cascade failure stemming from an
> I/O stall.
>
>>
>> Nodes in a cluster?
>
> Same story.
>
> The exception is my storage cluster, which has dirty_bytes much higher, as
> it's very solidly battery backed, so I can use its oodles of memory as a write
> cache, giving its kernel time to reorder writes and flush data to disk
> efficiently, and letting clients very rapidly return from write requests.

Are these TSDBs (time-series databases), by chance?

OK, so have you systematically experimented with these parameter
settings, and collected and correlated the data, specific to each
domain's needs?

As unikernels collide with my work on building up minimized and
optimized Linux clusters, my pathway forward is to use several small
clusters where the codes/frameworks can be changed, even the
tweaked/tuned kernels and DFS, and note the performance differences for
very specific domain solutions. My examples are quite similar to the
flight sim mentioned earlier, but the ordinary and uncommon workloads
of regular admin (dev/ops) work are just a different domain.

Ideas on automating the exploration of these settings
(scripts/traces/keystores) are keenly of interest to me, just so you know.
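
To make that concrete, what I'm imagining is nothing fancier than a
brute-force sweep along these lines (the workload script and log file are
placeholders, and the value ranges are just starting guesses):

#!/bin/sh
# Rough sketch: sweep vm.dirty_background_bytes / vm.dirty_bytes pairs,
# run a fixed workload each time, and log the wall-clock time so the
# results can be correlated later. run_workload.sh stands in for
# whatever domain-specific job is actually under test.
for bg in 1048576 4194304 16777216; do
    for mult in 2 5 10; do
        dirty=$((bg * mult))
        sysctl -w vm.dirty_background_bytes=$bg vm.dirty_bytes=$dirty
        sync; echo 3 > /proc/sys/vm/drop_caches   # start from a cold page cache
        start=$(date +%s.%N)
        ./run_workload.sh
        end=$(date +%s.%N)
        echo "$bg $dirty $(echo "$end - $start" | bc)" >> sweep-results.log
    done
done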
>> I use OpenRC, just so you know. I also have a motherboard with IOMMU
>> that currently has questionable settings in the kernel config file. I
>> cannot find consensus on if/how the IOMMU affects IO with the SATA HD
>> devices versus memory-mapped peripherals... in the context of 4.x kernel
>> options. I'm trying very hard here to avoid a deep dive on these issues,
>> so trendy strategies are most welcome, as workstation and cluster node
>> optimizations are all I'm really working on atm.
>
> Honestly, I'd suggest you deep dive. An image formed once, with clarity, will
> last you a lot longer than ongoing fuzzy and trendy images from people whose
> hardware and workflow are likely to be different from yours.
>
> The settings I provided should be absolutely fine for most use cases. The only
> exception would be mobile devices with spinning rust, but those are getting
> rarer and rarer...

I did a quick test with games-arcade/xgalaga. It's an old, quirky game
with sporadic lag variations. On a very lightly loaded workstation with
32GB of RAM and eight 4GHz 64-bit cores, there is no reason for in-game
lag. Your previous settings made it much better and quicker the vast
majority of the time, but not optimal (always responsive). Experience
tells me that if I can tweak a system so that the game stays responsive
while the application mix is running concurrently, then that quick test
plus those parameter settings is reasonably well behaved. So that
becomes a baseline for further automated tests and fine-tuning of a
system under study.

Perhaps Zabbix plus a TSDB can get me further down the pathway.
Time-sequenced and analyzed data is overkill for this (xgalaga) test,
but those coalesced test vectors will be most useful to me as I seek a
Gentoo-centric pathway to low-latency clusters (on bare metal).
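
If I understand the Zabbix side correctly, getting ad-hoc numbers like these
into it for trending is just a trapper item plus zabbix_sender; roughly the
line below (the server name and item key are made up for illustration):

# assumes a Zabbix item of type "Zabbix trapper" with key "xgalaga.lag"
# already defined for this host
zabbix_sender -z zabbix.example.lan -s "$(hostname)" -k xgalaga.lag -o 12.7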

TIA,

James
