Gentoo Archives: gentoo-user

From: Matti Nykyri <Matti.Nykyri@×××.fi>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] OT: Mapping random numbers (PRNG)
Date: Fri, 06 Jun 2014 21:03:59
Message-Id: 20140606210329.GA1631@lyseo.edu.ouka.fi
In Reply to: Re: [gentoo-user] OT: Mapping random numbers (PRNG) by "Canek Peláez Valdés"
1 On Thu, Jun 05, 2014 at 10:58:51PM -0500, Canek Peláez Valdés wrote:
2 > On Thu, Jun 5, 2014 at 9:56 PM, <meino.cramer@×××.de> wrote:
3 > > Hi,
4 > >
5 > > I am experimenting with the C code of the ISAAC pseudo random number generator
6 > > (http://burtleburtle.net/bob/rand/isaacafa.html).
7 > >
8 > > Currently the implementation creates (on my embedded linux) 32 bit
9 > > hexadecimal output.
10 >
11 > So it's a 32 bit integer.
12 >
13 > > From this I want to create random numbers in the range of [a-Za-z0-9]
14 > > *without violating randomness* and (if possible) without throwing
15 > > away bits of the output.
16 >
17 > You mean *characters* int the range [A-Za-z0-9]?
18
19 Well this isn't as simple problem as it sounds. A random 32 bit integer
20 has 32 bits of randomness. If you take a divison reminder of 62 from this
21 integer you will get only 5,95419631039 bits of randomness
22 (log(62)/log(2)). So you are wasting 81,4% of your random data. Which is
23 quite much and usually random data is quite expensive. You can save your
24 precious random data by taking only 6 bit from your 32 bit integer and
25 dividing it by 62. Then you will be wasting only 0,8% of random data.
26 Another problem is alignment, but that is about mathematical correctness.
27
28 > > How can I do this mathemtically (in concern of the quality of output)
29 > > correct?
30 >
31 > The easiest thing to do would be:
32
33 The easiest is not mathematically correct though. Random data will stay
34 random only if you select and modify it so that randomness is preserved.
35 If you take devison reminder of 62 from 32 bit integer there are 69 273
36 667 possibilities of the reminder to be 3 or less. For the reminder to 4
37 or more the number of possibilities is 69 273 666. In mathematically
38 ideal case the probability for every index of the list should be same:
39 1/62 = 1,61290322581%. But the modulo 62 modifies this probability: for
40 index 0-3 the probability is 69 273 667/2^32 = 1,61290324759%. And for
41 indexes 4-61 the probability will be 69 273 666/2^32 = 1,6129032243%.
42
43 If you wish not to waste those random bits the probabilities will get
44 worse. With 6 bits of random the probability for index 0-1 will be 2/64
45 and for 2-63 it will be 1/64. This is a very significant change because
46 first and second index will appear twice as much as the rest. If you add
47 2 characters to your list you will perfect alignment and you can take 6
48 bits of data without it modifying probabilities.
49
50 If you are looking a mathematically perfect solution there is a simple
51 one even if your list is not in the power of 2! Take 6 bits at a time of
52 the random data. If the result is 62 or 63 you will discard the data and
53 get the next 6 bits. This selectively modifies the random data but keeps
54 the probabilities in correct balance. Now the probability for index of
55 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0.
56
57 > -------------------------------------------------------------------------------
58 > #include <time.h>
59 > #include <stdio.h>
60 > #include <stdlib.h>
61 >
62 > #define N (26+26+10)
63 >
64 > static char S[] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
65 > 'K', 'L', 'M',
66 > 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
67 > 'X', 'Y', 'Z',
68 > 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
69 > 'k', 'l', 'm',
70 > 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
71 > 'x', 'y', 'z',
72 > '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
73 >
74 > int
75 > next_character()
76 > {
77 > // Use the correct call for ISAAC instead of rand()
78 > unsigned int idx = rand() % N;
79 > return S[idx];
80 > }
81
82 so modify the next_char function:
83
84 char next_character()
85 {
86 static unsigned int rand = 0; //(sizeof(int) = 32)
87 static char bit_avail = 0;
88 char result = 0;
89 char move_bits = 0;
90 char bits_moved = 0;
91
92 do {
93 if (!bits_avail) {
94 // Use the correct call for ISAAC instead of rand()
95 rand = rand();
96
97 bit_avail = 32;
98 }
99
100 move_bits = bits_avail >= 6 ? 6 : bits_avail;
101 result <<= move_bits;
102 result = (result | rand & (0xFF >> (8 - move_bits))) & 0x3F;
103 bits_avail -= move_bits;
104 bits_moved += move_bits;
105 rand >>= move_bits;
106
107 } while (bits_moved != 6 && result > 61);
108
109 return result;
110 }
111
112 This function will give perfect distribution of 1/62 probability for
113 every index. It will waste 6 bits with the probability of 1/32 (2/64).
114
115 > int
116 > main(int argc, char* argv[])
117 > {
118 > // Use the correct call for initializing the ISAAC seed
119 > srand((unsigned int)time(NULL));
120 > for (int i = 0; i < 20; i++) // --std=c99
121 > printf("%c\n", next_character());
122 > return 0;
123 > }
124 > -------------------------------------------------------------------------------
125 >
126 > If the ISAAC RNG has a good distribution, then the next_character()
127 > function will give a good distribution among the set [A-Za-z0-9].
128 >
129 > Unless I missunderstood what you meant with "create random numbers in
130 > the range of [a-Za-z0-9]".
131
132 --
133 -Matti

Replies

Subject Author
Re: [gentoo-user] OT: Mapping random numbers (PRNG) Matti Nykyri <Matti.Nykyri@×××.fi>
[gentoo-user] Re: OT: Mapping random numbers (PRNG) Kai Krakow <hurikhan77@×××××.com>