Mark Knecht posted on Sat, 22 Jun 2013 18:48:15 -0700 as excerpted:

> Duncan,

Again, following up now that it's my "weekend" and I have a chance...

> Actually, using your idea of piping things to /dev/null it appears
> that the random number generator itself is only capable of 15MB/S on my
> machine. It doesn't change much based on block size of number of bytes
> I pipe.

=:^(

Well, you tried.

> If this speed is representative of how well that works then I think
> I have to use a file. It appears this guy gets similar values:
>
> http://www.globallinuxsecurity.pro/quickly-fill-a-disk-with-random-bits-without-dev-urandom/

Wow, that's a very nice idea he has there! I'll have to remember that!
The same idea should work for creating any relatively large random file,
regardless of final use. Just cryptsetup the thing and dd /dev/zero
into it.
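
Roughly like this, I'd guess (untested here, and /dev/sdX and the
"wipeme" mapping name are just placeholders for whatever you're actually
filling):

cryptsetup open --type plain -d /dev/urandom /dev/sdX wipeme   # /dev/sdX = your test device
dd if=/dev/zero of=/dev/mapper/wipeme bs=$((1024*1024))
cryptsetup close wipeme

The cipher turns the stream of zeros into effectively random ciphertext
on its way to the device, so you get random-looking data written at
/dev/zero-ish speeds rather than /dev/urandom speeds.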

FWIW, you're doing better than my system does, however. I seem to run
about 13 MB/s from /dev/urandom (up to 13.7 depending on blocksize). And
back to the random vs urandom discussion, random totally blocked here
after a few dozen bytes, waiting for more random data to be generated.
So the fact that you actually got a usefully sized file out of it does
indicate that you must have a hardware random number generator and that
it's apparently working well.
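
(For anyone wanting to see that for themselves: on a box without a
hardware generator, even something this small is likely to stall partway
through, while the same command reading /dev/urandom returns instantly:

dd if=/dev/random of=/dev/null bs=64 count=8

... that's only 512 bytes, but it's more entropy than an idle pool
usually has on hand.)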

> On the other hand, piping /dev/zero appears to be very fast -
> basically the speed of the processor I think:
>
> $ dd if=/dev/zero of=/dev/null bs=4096 count=$[1000]
> 1000+0 records in 1000+0 records out 4096000 bytes (4.1 MB) copied,
> 0.000622594 s, 6.6 GB/s

What's most interesting to me when I tried that here is that unlike
urandom, zero's output varies DRAMATICALLY by blocksize. With
bs=$((1024*1024)) (aka 1 MB) I get 14.3 GB/s, tho at the default bs=512
I get only 1.2 GB/s. (Trying a few more values: 1024*512 gives a very
similar 14.5 GB/s, 1024*64 is already down to 13.2 GB/s, 1024*128 gives
13.9 and 1024*256 gives 14.1 GB/s, while on the high side 1024*1024*2 is
already down to 10.2 GB/s. So a quarter MB to one MB seems the ideal
range, on my hardware.)
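
(If anyone wants to repeat that sweep without retyping, a quick shell
loop does it; adjust the bs list and the 8 GiB total to taste:

for bs in 512 $((1024*64)) $((1024*256)) $((1024*1024)) $((1024*1024*2)); do
  echo "bs=$bs"
  dd if=/dev/zero of=/dev/null bs=$bs count=$((8*1024*1024*1024/bs)) 2>&1 | tail -n1
done

Each pass pushes the same 8 GiB total, so the throughput summary lines
dd prints are directly comparable.)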

But of course, if your device is compressible-data speed-sensitive, as
the sandforce-controller-based ssds are, /dev/zero isn't going to give
you anything like the real-world benchmark that random data would (tho
it should be a great best-case compressible-data test). It's unlikely to
matter on most spinning rust, AFAIK, or on SSDs like my Corsair Neutrons
(Link_A_Media/LAMD-based controller), which have data-compression
agnosticism as a bullet-point feature, unlike the sandforce-based SSDs.

Since /dev/zero is so fast, I'd probably do a few initial tests to
determine whether compressible data makes a difference for what you're
testing, then use /dev/zero (if it doesn't) to get a reasonable base
config, and finally double-check that config against random data again.
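
Something along these lines would do for that first comparison, I
imagine, with /dev/test/target standing in for whatever you're
benchmarking and the random input being a pre-generated file as
described below (oflag=direct just keeps the page cache from muddying
the numbers):

dd if=/dev/zero of=/dev/test/target bs=$((1024*1024)) count=1024 oflag=direct
dd if=/tmp/10gig.testfile of=/dev/test/target bs=$((1024*1024)) count=1024 oflag=direct

If those two report clearly different speeds, the device cares about
compressibility and the random-data numbers are the ones to trust.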

Meanwhile, here's another idea for random data, seeing as /dev/urandom
is speed-limited. Up to your memory constraints anyway, you should be
able to dd if=/dev/urandom of=/some/file/on/tmpfs. Then you can
dd if=/tmpfs/file of=/dev/test/target, or if you want to feed it more
data than a single tmpfs file will hold, try something like this:

cat /tmpfs/file /tmpfs/file /tmpfs/file | dd of=/dev/test/target

... which would give you 3X the data size of /tmpfs/file.

(Man, testing that with a 10 GB tmpfs file (on a 12 GB tmpfs /tmp), I
can see how slow that 13 MB/s /dev/urandom actually is as I'm creating
it! OUCH! I waited awhile before I started typing this comment... I've
been typing slowly and looking at the usage graph as I type, and I'm
still only at maybe 8 gigs right now, depending on where my cache usage
was when I started!)

cd /tmp

dd if=/dev/urandom of=/tmp/10gig.testfile bs=$((1024*1024)) count=10240

(10240 records, 10737418240 bytes, tho it says 11 GB copied; I guess dd
uses 10^3 multipliers. Anyway, ~783 s, 13.7 MB/s.)

ls -l 10gig.testfile

(confirm the size, 10737418240 bytes)

cat 10gig.testfile 10gig.testfile 10gig.testfile \
    10gig.testfile 10gig.testfile | dd of=/dev/null

(that's 5x, yielding 50 GiB: 104857600+0 records, 53687091200 bytes,
~140 s, 385 MB/s at the default 512-byte blocksize)

Wow, what a difference block size makes there, too! Trying the above
cat/dd with bs=$((1024*1024)) (1 MB) yields ~30 s, 1.8 GB/s!

A 1 GB block size (1024*1024*1024) yields about the same, 30 s, 1.8 GB/s.

LOL, dd didn't like my idea to try a 10 GB buffer size!

dd: memory exhausted by input buffer of size 10737418240 bytes (10 GiB)

(No wonder, as that'd be 10 GB in tmpfs/cache plus a 10 GB buffer, and
I'm /only/ running 16 gigs RAM and no swap! But it won't take 2 GB
either. Checking, it looks like as my normal user I'm running a ulimit
of 1-gig memory size and 2-gig virtual size, so I'm sort of surprised it
took the 1 GB buffer... maybe that limit counts against virtual only or
something?)
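
(For anyone checking their own limits, those numbers are what bash's
ulimit builtin reports, in KB:

ulimit -m
ulimit -v

... -m being the max memory/resident-set size and -v the max virtual
memory.)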

Low side again: ~90 s, 599 MB/s at a 1 KB (1024 byte) bs, already a
dramatic improvement over the 140 s / 385 MB/s of the default 512-byte
block.

2 KB bs yields 52 s, 1 GB/s.

16 KB bs yields 31 s, 1.7 GB/s, near optimum already.

High side again: 1024*1024*4 (4 MB) bs appears to be best-case, just
under 29 s, 1.9 GB/s. Going to 8 MB takes another second, 1.8 GB/s
again. (Not sure why 4 MB in particular; the normal page size is only
4 KB, tho 4 MB does match the old x86 large-page size, so some TLB or
cache effect seems plausible.)
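
(getconf PAGESIZE is the quick way to check the normal page size, for
what it's worth:

getconf PAGESIZE

... which reports 4096 on typical x86 systems.)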

FWIW, cat seems to run just over 100% single-core saturation while dd
seems to run just under, at 97% or so.

Running two instances in parallel (using the peak 4 MB block size that
gives 1.9 GB/s in a single run) seems to cut performance some, but not
nearly in half. (I got 1.5 GB/s and 1.6 GB/s, but I started one then
switched to a different terminal to start the other, so they only
overlapped for maybe 30 s of the ~35 s each one took.)
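
(Backgrounding both pipelines and waiting would give a cleaner overlap
than my two-terminals approach, something like:

cat 10gig.testfile 10gig.testfile 10gig.testfile \
    10gig.testfile 10gig.testfile | dd of=/dev/null bs=$((1024*1024*4)) &
cat 10gig.testfile 10gig.testfile 10gig.testfile \
    10gig.testfile 10gig.testfile | dd of=/dev/null bs=$((1024*1024*4)) &
wait

... so both runs cover their full duration together.)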

OK, so that's all memory/CPU since neither end is actual storage, but it
does give me a reasonable base against which to benchmark actual storage
(rust or ssd), if I wished.

What's interesting is that, by pure coincidence I guess, my 385 MB/s
original 512-byte-blocksize figure is reasonably close to what the SSD
read benchmarks are with hdparm. IIRC the hdparm/ssd numbers were
somewhat higher, but not by much (470 MB/sec when I just tested). And
the bus speed maxes out not /too/ far above that: 500-600 MB/sec,
theoretically 600 MB/sec on SATA-600, tho real-world obviously won't
/quite/ hit that; IIRC the best numbers I've seen anywhere are 585 or so.
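
(For anyone wanting to compare, hdparm's buffered disk-read timing is
the usual way to get that number, with /dev/sdX being whichever device
is under test:

hdparm -t /dev/sdX   # whichever device is under test

... and -T gives the cache-read figure instead.)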

So now I guess I send this and do some more testing on a real device,
now that you've provoked my curiosity and I have the 50 GB (mostly)
pseudorandom file sitting in tmpfs already. Maybe I'll post those
results later.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman