On 08/09/2016 01:41 PM, Michael Mol wrote:
> On Tuesday, August 09, 2016 01:23:57 PM james wrote:
>> On 08/09/2016 09:17 AM, Michael Mol wrote:
>>> On Tuesday, August 09, 2016 09:13:31 AM james wrote:
>>>> On 08/09/2016 07:42 AM, Michael Mol wrote:
>>>>> On Monday, August 08, 2016 10:45:09 PM Alan McKinnon wrote:
>>>>>> On 08/08/2016 19:20, Michael Mol wrote:
>>>>>>> On Monday, August 08, 2016 06:52:15 PM Alan McKinnon wrote:
>>>>>>>> On 08/08/2016 17:02, Michael Mol wrote:
>
>>> I use Zabbix extensively at work, and have the Zabbix agent on my
>>> workstation reporting back various supported metrics. There's a great
>>> deal you can use (and--my favorite--abuse) Zabbix for, especially once
>>> you understand how it thinks.
>>
>> Congratulations! Of the net-analyzer crowd, you've managed to find one I
>> have not spent time with........
>
> Oh, man, are you in for a treat. I recently had a conversation with a guy I
> happened to sit next to while traveling about how, were I in his position,
> I'd improve his cash-crop and hydroponics operations (he periodically tests
> soil and sunlight properties) by sampling continually instead, using a
> combination of cheap, custom probes and SBCs, feeding the data into Zabbix
> for monitoring and trend analysis / prediction. Zabbix will do time-series
> graphing and analysis of arbitrary input data; it may have been designed for
> watching interface counters, but there's no reason it need be limited to
> that...

Not sure of your tendencies, but yeah, I tend to be more hardware- and
EE-oriented than CS. Yep, I spent too many years with time-sequenced data
(turds) not to be totally excited about what we can now do with
clusters, analog (16-bit+) IO, and enough processors and memory to keep
a simulation going in RT (color). You sure know how to instigate an
itch.....

Besides, as I transcend retirement, I'm looking for greener pastures
and methodologies to enhance da(tm) dream state......
(thx)


>>>> Any specific kernel tweaks?
>>>
>>> Most of my tweaks for KDE revolved around tuning mysqld itself. But for
>>> sysctls improving workstation responsiveness as it relates to memory
>>> interactions with I/O, these are my go-tos:
>>>
>>> vm.dirty_background_bytes = 1048576
>>> vm.dirty_bytes = 10485760
>>> vm.swappiness = 0
>>
>> Mine are:
>> cat dirty_bytes
>> 0
>> cat dirty_background_bytes
>> 0
>
> So, that means you have vm.dirty_ratio and vm.dirty_background_ratio
> set, instead. I forget what those default to, but I think
> vm.dirty_background_ratio defaults to something like 10, which means *10%* of
> your memory may get used for buffering disk I/O before it starts writing data
> to disk. vm.dirty_ratio will necessarily be higher, which means that if
> you're performing seriously write-intensive activities on a system with 32GiB
> of RAM, you may find yourself with a system that will halt until it finishes
> flushing 3+GiB of data to disk.
>
>> cat swappiness
>> 60
>
> Yeah, you want that set lower than that.
>
>>
>>> vm.dirty_background_bytes ensures that any data (i.e. from mmap or
>>> fwrite, not from swapping) waiting to be written to disk *starts*
>>> getting written to disk once you've got at least the configured amount
>>> (1MB) of data waiting. (If you've got a disk controller with
>>> battery-backed or flash-backed write cache, you might consider
>>> increasing this to some significant fraction of your write cache. I.e.
>>> if you've got a 1GB FBWC with 768MB of that dedicated to write cache,
>>> you might set this to 512MB or so. Depending on your workload. I/O
>>> tuning is for those of us who enjoy the dark arts.)
>>>
>>> vm.dirty_bytes says that once you've got the configured amount (10MB)
>>> of data waiting to be written to disk, no more asynchronous I/O is
>>> permitted until you have no more data waiting; all outstanding writes
>>> must be finished first. (My rule of thumb is to have this between 2 and
>>> 10 times the value of vm.dirty_background_bytes. Though I'm really
>>> trying to avoid it being high enough that it could take more than 50ms
>>> to transfer to disk; that way, any stalls that do happen are almost
>>> imperceptible.)
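
A sanity check on that 50ms rule, if I read it right: a disk sustaining
~200 MB/s moves 10 MB in about 50 ms, so the 10485760 above fits a decent
SSD, while spinning rust at ~100 MB/s would argue for something closer
to 5 MB.
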
>>>
>>> You want vm.dirty_background_bytes to be high enough that your hardware
>>> doesn't spend its time powered on if it doesn't have to be, and so that
>>> your hardware can transfer data in large, efficient, streamable chunks.
>>>
>>> You want vm.dirty_bytes sufficiently higher than vm.dirty_background_bytes
>>> that your hardware has enough time to spin up and transfer data before you
>>> put the hammer down and say, "all right, nobody else gets to queue
>>> writes until all the waiting data has reached disk."
>>>
>>> You want vm.dirty_bytes *low* enough that when you *do* have to put that
>>> hammer down, it doesn't interfere with your perception of a responsive
>>> system. (And in a server context, you want it low enough that things
>>> can't time out--or be pushed into timing out--waiting for it. Call your
>>> user attention a matter of timing out expecting things to respond to
>>> you, and the same principle applies...)
>>>
>>> Now, vm.swappiness? That's a weighting factor for how quickly the kernel
>>> should try moving memory to swap to be able to speedily respond to new
>>> allocations. Me, I prefer the kernel not to preemptively move
>>> lesser-used data to swap, because that's going to be a few hundred
>>> megabytes' worth of data all associated with one application, and it'll
>>> be a real drag when I switch back to the application I haven't used for
>>> half an hour. So I set vm.swappiness to 0, to tell the kernel to only
>>> move data to swap if it has no other alternative while trying to satisfy
>>> a new memory allocation request.
>>
>> OK, OK, OK. I need to read a bit about these. Any references or docs, or
>> is this the result of parsing out what is least painful for a
>> workstation? I do not run any heavy databases on my workstation; they
>> are only there to hack on. I test db-centric stuff on domain
>> servers, sometimes with limited resources. I run lxde and I'm moving to
>> lxqt for workstations and humanoid (terminal) IO.
>
> https://www.kernel.org/doc/Documentation/sysctl/vm.txt
> https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

Excellent docs, thx.
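
For my own notes, the runtime-vs-persistent dance with your values would
look something like this (per those docs, the _bytes and _ratio knobs are
mutually exclusive -- setting one zeroes its counterpart, which explains
why my dirty_bytes read 0 above):

  # try the values live first; reversible at runtime
  sysctl -w vm.dirty_background_bytes=1048576
  sysctl -w vm.dirty_bytes=10485760
  sysctl -w vm.swappiness=0

  # if the workstation feels right, persist across reboots
  echo 'vm.dirty_background_bytes = 1048576' >> /etc/sysctl.conf
  echo 'vm.dirty_bytes = 10485760'           >> /etc/sysctl.conf
  echo 'vm.swappiness = 0'                   >> /etc/sysctl.conf
  sysctl -p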

>> Do you set these differently for servers?
>
> On my servers, I keep these values similar, because I'd rather have a little
> bit lower throughput than risk a catastrophic cascade failure stemming from an
> I/O stall.
>
>>
>> Nodes in a cluster?
>
> Same story.
>
> The exception is my storage cluster, which has dirty_bytes much higher, as
> it's very solidly battery backed, so I can use its oodles of memory as a write
> cache, giving its kernel time to reorder writes and flush data to disk
> efficiently, and letting clients very rapidly return from write requests.

Are these TSdB (time-series data) by chance?

OK, so have you systematically experimented with these parameter
settings, collected and correlated the data, specific to domain needs?

As unikernels collide with my work on building up minimized and
optimized linux clusters, my pathway forward is to use several small
clusters, where the codes/frameworks can be changed, even the
tweaked-tuned kernels and DFS, and note the performance differences for
very specific domain solutions. My examples are quite similar to the
flight sim mentioned earlier, but the ordinary and uncommon workloads
of regular admin (dev/ops) work are only a different domain.

Ideas on automating the exploration of these settings
(scripts/traces/keystores) are keenly of interest to me, just so you know.
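
Something like this sweep is roughly what I have in mind -- just a
sketch, where the dd streaming write stands in for a real domain
workload and the CSV goes off to whatever does the correlating:

  #!/bin/sh
  # Sweep vm.dirty_background_bytes / vm.dirty_bytes pairs and log a
  # crude wall-clock figure for a 2GiB synced write at each setting.
  OUT=dirty_sweep.csv
  echo "dirty_background_bytes,dirty_bytes,seconds" > "$OUT"
  for BG in 1048576 4194304 16777216; do
      for MULT in 2 4 10; do
          DB=$((BG * MULT))
          sysctl -q -w vm.dirty_background_bytes="$BG"
          sysctl -q -w vm.dirty_bytes="$DB"
          sync; echo 3 > /proc/sys/vm/drop_caches   # start each run cold
          START=$(date +%s.%N)
          dd if=/dev/zero of=testfile bs=1M count=2048 conv=fsync 2>/dev/null
          END=$(date +%s.%N)
          echo "$BG,$DB,$(echo "$END - $START" | bc)" >> "$OUT"
          rm -f testfile
      done
  done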


>> I use OpenRC, just so you know. I also have a motherboard with IOMMU
>> that currently has questionable settings in the kernel config file. I
>> cannot find consensus on if/how IOMMU affects IO with the SATA HD
>> devices versus memory-mapped peripherals.... in the context of 4.x kernel
>> options. I'm trying very hard here to avoid a deep dive on these issues,
>> so trendy strategies are most welcome, as workstation and cluster node
>> optimizations are all I'm really working on atm.
>
> Honestly, I'd suggest you deep dive. An image made once, with clarity, will
> last you a lot longer than ongoing fuzzy and trendy images from people whose
> hardware and workflow are likely to be different from yours.
>
> The settings I provided should be absolutely fine for most use cases. The
> only exception would be mobile devices with spinning rust, but those are
> getting rarer and rarer...

I did a quick test with games-arcade/xgalaga. It's an old, quirky game
with sporadic lag variations. On a very lightly loaded workstation with
32G RAM and (8) 4GHz 64-bit cores, there is no reason for in-game lag.
Your previous settings made it much better and quicker the vast majority
of the time, but not optimal (always responsive). Experience tells me
that if I can tweak a system so that the game stays responsive while the
application mix runs concurrently, then that quick test plus parameter
settings is reasonably well behaved. So that becomes a baseline for
further automated tests and fine-tuning of a system under study.
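
(If the game is too ad hoc a probe, cyclictest from rt-tests should give
a comparable, scriptable number -- assuming it's in the tree; run the
application mix alongside and watch the Max column:

  # 10,000 samples of scheduler wakeup latency at 200us intervals
  cyclictest -m -n -p 80 -i 200 -l 10000 -q
)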

Perhaps Zabbix + TSdB can get me further down the pathway. Time-sequenced
and analyzed data is overkill for this (xgalaga) test, but those
coalesced test-vectors will be most useful for me as I seek a
gentoo-centric pathway for low-latency clusters (on bare metal).
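
Looks like Zabbix even ships a zabbix_sender CLI for pushing arbitrary
numbers at a trapper item, which would fit these test-vectors; the host
and item key here are made up for illustration:

  # one data point into a pre-defined trapper item
  zabbix_sender -z zabbix.example.com -s workstation1 -k latency.xgalaga -o 42.7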

TIA,


James