Gentoo Archives: gentoo-user

From: Kai Krakow <hurikhan77@×××××.com>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Re: Re: OT: Mapping random numbers (PRNG)
Date: Sat, 28 Jun 2014 21:36:10
Message-Id: 0ko38b-2a.ln1@hurikhan77.spdns.de
In Reply to: Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG) by Matti Nykyri
1 Matti Nykyri <matti.nykyri@×××.fi> schrieb:
2
3 >> On Jun 27, 2014, at 0:00, Kai Krakow <hurikhan77@×××××.com> wrote:
4 >>
5 >> Matti Nykyri <Matti.Nykyri@×××.fi> schrieb:
6 >>
7 >>> If you are looking a mathematically perfect solution there is a simple
8 >>> one even if your list is not in the power of 2! Take 6 bits at a time of
9 >>> the random data. If the result is 62 or 63 you will discard the data and
10 >>> get the next 6 bits. This selectively modifies the random data but keeps
11 >>> the probabilities in correct balance. Now the probability for index of
12 >>> 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0.
13 >>
14 >> Why not do just something like this?
15 >>
16 >> index = 0;
17 >> while (true) {
18 >> index = (index + get_6bit_random()) % 62;
19 >> output << char_array[index];
20 >> }
21 >>
22 >> Done, no bits wasted. Should have perfect distribution also. We also
23 >> don't have to throw away random data just to stay within unaligned
24 >> boundaries. The unalignment is being taken over into the next loop so the
25 >> "error" corrects itself over time (it becomes distributed over the whole
26 >> set).
27 >
28 > Distribution will not be perfect. The same original problem persists.
29 > Probability for index 0 to 1 will be 2/64 and for 2 to 61 it will be 1/64.
30 > Now the addition changes this so that index 0 to 1 reflects to previous
31 > character and not the original index.
32 >
33 > The distribution of like 10GB of data should be quite even but not on a
34 > small scale. The next char will depend on previous char. It is 100% more
35 > likely that the next char is the same or one index above the previous char
36 > then any of the other ones in the series. So it is likely that you will
37 > have long sets of same character.
38
39 I cannot follow your reasoning here - but I'd like to learn. Actually, I ran
40 this multiple times and never saw long sets of the same character, even no
41 short sets of the same character. The 0 or 1 is always rolled over into the
42 next random addition. I would only get sets of the same character if rand()
43 returned zero multiple times after each other - which wouldn't be really
44 random. ;-)
45
46 Keep in mind: The last index will be reused whenever you'd enter the
47 function - it won't reset to zero. But still that primitive implementation
48 had a flaw: It will tend to select characters beyond the current offset, if
49 it is >= 1/2 into the complete set, otherwise it will prefer selecting
50 characters before the offset.
51
52 In my tests I counted how ofter new_index > index and new_index < index, and
53 it had a clear bias for the first. So I added swapping of the selected index
54 with offset=0 in the set. Now the characters will be swapped and start to
55 distribute that flaw. The distribution, however, didn't change.
56
57 Of course I'm no mathematician, I don't know how I'd calculate the
58 probabilities for my implementation because it is sort of a recursive
59 function (for get_rand()) when looking at it over time:
60
61 int get_rand() {
62 static int index = 0;
63 return (index = (index + get_6bit_rand()) % 62);
64 }
65
66 char get_char() {
67 int index = get_rand();
68 char tmp = chars[index];
69 chars[index] = chars[0];
70 return (chars[0] = tmp);
71 }
72
73 However, get_char() should return evenly distributes results.
74
75 What this shows, is, that while distribution is even among the result set,
76 the implementation may still be flawed because results could be predictable
77 for a subset of results. Or in other words: Simply looking at the
78 distribution of results is not an indicator for randomness. I could change
79 get_rand() in the following way:
80
81 int get_rand() {
82 static int index = 0;
83 return (index = (index + 1) % 62);
84 }
85
86 Results would be distributed even, but clearly it is not random.
87
88 --
89 Replies to list only preferred.

Replies

Subject Author
Re: [gentoo-user] Re: Re: OT: Mapping random numbers (PRNG) "Canek Peláez Valdés" <caneko@×××××.com>
Re: [gentoo-user] Re: Re: OT: Mapping random numbers (PRNG) Matti Nykyri <matti.nykyri@×××.fi>