]]>

Is that possible?]]>

-----

Look at the following example:

Suppose that you have a random source that works like a dice; it produce integers 1 to 6, and say that there is no statistical bias in this source. You should, as an example, generate a random number in the range 0-49.

If you roll the dice only once, the output will be 1-6 with equal probability,

and you can never obtain zero or 7-49, so the output is severely biased.

If you roll the dice twice and add, the output will be 2-12, and those numbers will not have the same probability.

If you continue and roll the dice 10 times, the output number range is 10-60, and if the sum is above 49, we subtract 50 to fold into the intended number range. Note that the relative probability of every number can be exactly calculated.

If we expand on this and roll the dice a thousand times, the resulting number range would be 1000-6000, with different probabilities, and after truncating down into our result number range, we get almost equal probability for all our 50 numbers!

So the frequency distribution of the input don't matter much, but it must be random and uncorrelated. The frequency distribution of the sum convergence towards a Gaussian distribution, and when we fold the distribution into a small number range, a flat distribution is produced. It is always like this!

There is much to do when it comes to improving the rate of convergence.

The folding-effect, that produce the equal probability in the population, works best if we mix completely unrelated distributions. It would be a good thing if the distribution would be reasonably flat also, but that is less important.

Consider this:

Take an ordered list of 0-49, and mix the numbers into a substitution table. Then proceed:

Say that you have a number in the 0-49 range. Then go through the substitution table, where

0->23 1->12 2-> 7 3->41 4->37 and so on

to obtain the number 23. You now add a dice roll to reach 24-29. You must subtract 50 if you go beyond the number range.

Then again apply the substitution table!

This have a much faster convergence compared to the original set-up. Note that there is no secrets inside the substitution table, all we need is a conversion unrelated to the ordered 1-6 dice roll.

This is a way of improving random numbers.

- You combine the digits into the maximum that your input can generate (99)

- You discard the rest of the range (100..255)

You can discard less if you have 10 bit input and can generate 3 digits (0..999 and discard only 1000..1023).

This is how to solve for a optimum solution:

- Your input is (0..255) so, in bits it is log (256)/log(2) = 8 (any log will do)

- Your range is log (10)/log(2) = 3.32

- The ratio of a digit to a byte is: 3.32/8=0.415

By adding (to itself) the number 0.415 we see how much we can use of the input, and how large the loss is:

Digits Fill Number of bytes 1 0.415 1 2 0.830 1 3 1.246 2 4 1.661 2 5 2.076 3 6 2.491 3 7 2.907 3 8 3.322 4 9 3.373 4 10 4.152 5 11 4.568 5 12 4.983 5

Since the input comes in bytes, using 3 bytes and convert into 7 digits produce a loss of (3.000-2.907)/7=0.013 bytes for each output digit. This obviously get better if we use larger numbers. If you use 5 bytes and generate 12 digits, the loss is only 0.0014 bytes/digit.

A compromise is usually easy to find.

]]>

unsigned int Random_conversion(Range) { unsigned int Range: unsigned int inc_scale,Rand_word,Done,count; if (Range==0){ trap("Range is zero in Random_conversion()"); } else { inc_scale=0xFFFFFFFF/Range; if (inc_scale==0){ inc_scale=1; } while (!Done){ Rand_word=int32_truerandomnumber(); count=Rand_word/inc_scale; Done=count<Range; } } return count; }

It is easy to convert a number from one range to another. The important conceptual part is that the conversion, in general, cannot be complete, so all input random numbers do not convert into an output number. This is due to that we prefer an exact conversion, where the output numbers also have a flat distribution (exactly). As an example, I convert a random number in the range [0..15] to the range [0..2].

It is easy so see how this should be done. We have 16 input possibilities of exactly equal probabillity, that shall be divided into three sets of 5 number each, and then there will be a last input number that cannot be converted.

If the input number is

The magic divisor "5", in the example above can be calculated by

Range = 3; Largest = 15; D = Largest / Range; Result = Range; while ( Result == Range) { Rand = read_random_number; Result = Rand / D; }

]]>