
Random Normal Number Generator
Jim Cullen



There are several approaches to the problem of generating a random series of numbers that are distributed normally, some better than others. They include random lookup tables, transformation functions, and other, more sophisticated methods. A few will be discussed here as time permits.

The Normal Distribution, sometimes referred to as the "Bell Curve", is the distribution at the heart of the Central Limit Theorem: sums of many independent random values tend toward it. It describes data that is randomly scattered around a central point, with the concentration highest at the center and falling off gradually on either side. In its simplest form, the equation for the Bell Curve is:

y = e^( -1/2 . x^2 )


This form calculates the ordinates of the Normal Distribution, with x being the distance, plus or minus, from the arithmetic mean expressed in units of the standard deviation ( x / s ). The height of the ordinate is given as a fraction of the maximum ordinate, which occurs at zero. In these terms, the equation describes the classic Bell Curve, which has an arithmetic mean of zero and a standard deviation of one, and the height of the ordinate at zero is equal to one. To obtain the theoretical frequency (the probability density) for any value of x, you simply multiply by a constant, 1 / sqrt( 2 . pi ), which works out to a decimal value of approximately 0.39894228. For example, at x = 1 the ordinate is e^( -1/2 ), about 0.60653066, and the theoretical frequency is about 0.24197072. Without this constant, the frequency of an ordinate is expressed relative to the frequency of the ordinate at x = 0.
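As a small illustration of the two forms of the equation, here is a sketch in Python. The names relative_ordinate, density, and NORMALIZER are chosen for this example only; they do not come from any library.

```python
import math

# Constant that converts a relative ordinate into a theoretical frequency
# (probability density): 1 / sqrt(2 * pi), roughly 0.39894228.
NORMALIZER = 1.0 / math.sqrt(2.0 * math.pi)


def relative_ordinate(x):
    """Height of the Bell Curve at x, as a fraction of its height at x = 0.

    x is the distance from the mean in units of the standard deviation.
    """
    return math.exp(-0.5 * x * x)


def density(x):
    """Theoretical frequency at x for mean 0 and standard deviation 1."""
    return NORMALIZER * relative_ordinate(x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(f"x = {x:3.1f}   relative = {relative_ordinate(x):.8f}   "
              f"density = {density(x):.8f}")
```

Only the relative form is needed for the generator described on this page, since the comparison is always made against the height at the mean.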

A random series of numbers that approaches a Normal Distribution cannot be taken directly from the typical random number generator in a computer language, because such generators return values spread uniformly between zero and one. In an ideal uniform distribution, the theoretical frequency of any ordinate is equal to that of all the others. The Normal Distribution also extends to plus or minus infinity, while random number generators have a fixed, finite range.

To convert the uniform random numbers to a Normal Distribution, we may use the equation for the Normal Distribution to decide which random values to keep, an approach commonly called rejection sampling. First, we need to extend the range of our random numbers to include the tail areas of the Normal Distribution. Simple math will be used to stretch and center the generator's output over whatever range is required, providing what we will call 'candidate' random numbers. A second random number, which we'll call the 'key' random, is then generated and compared to the ordinate value obtained from the Normal Distribution equation for the candidate. If our key fits, that is, if it is less than the theoretical frequency of the candidate's ordinate relative to the frequency at the arithmetic mean, then our 'candidate' random is accepted as a data point. Otherwise both numbers are discarded and a new pair is drawn.
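The candidate-and-key procedure can be sketched in a few lines of Python. This is one possible reading of the description above, not a definitive implementation: the function name normal_random, the default range limit of 4 standard deviations, and the rng parameter are all assumptions made for this example.

```python
import math
import random
import statistics


def normal_random(limit=4.0, rng=random):
    """Return one accepted data point in the range [-limit, +limit).

    limit is the assumed cutoff, in standard deviations, for the tails.
    rng is anything with a random() method returning a value in [0, 1).
    """
    while True:
        # 'Candidate': stretch and center a uniform value onto [-limit, +limit).
        candidate = (rng.random() * 2.0 - 1.0) * limit
        # 'Key': a second uniform value in [0, 1).
        key = rng.random()
        # Accept when the key falls under the relative ordinate e^(-x^2 / 2).
        if key < math.exp(-0.5 * candidate * candidate):
            return candidate


if __name__ == "__main__":
    data = [normal_random() for _ in range(10000)]
    print("mean    :", statistics.mean(data))
    print("std dev :", statistics.stdev(data))
```

To produce data with some other mean m and standard deviation s, multiply each accepted value by s and add m.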

There are advantages and disadvantages to this method. Many calls to the random generator may have to be made to obtain one data point. For most applications, unless you require many millions of normally distributed random numbers, this will not be a problem. In fact, making a random number of calls to the generator before accepting a data point can be seen as an advantage, since you avoid accepting consecutive values from the mathematical routine that simulates random numbers in the computer. There is also the problem of extending our range to plus or minus infinity, which isn't practical with this method. As we extend the range of our 'candidate' random numbers, a smaller share of candidates is accepted, so even more calls to the random generator are required to obtain each data point. The more we extend our range, the slower the routine runs, but the more accurate the results. This tradeoff between accuracy and speed shows up as a standard deviation of the final data set that is some tiny amount less than one, because the tails beyond the chosen range are cut off.
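The tradeoff can be put into rough numbers. The sketch below uses standard results for a Normal Distribution cut off at plus or minus a chosen limit; the function names acceptance_rate and truncated_std are made up for this example. It assumes candidates are spread uniformly over the range and accepted by comparing a key to the relative ordinate, as described above.

```python
import math


def acceptance_rate(limit):
    """Fraction of candidates accepted when the range is [-limit, +limit].

    This is the area under e^(-x^2 / 2) over the range, divided by the
    area of the enclosing rectangle (width 2 * limit, height 1).
    """
    area = math.sqrt(2.0 * math.pi) * math.erf(limit / math.sqrt(2.0))
    return area / (2.0 * limit)


def truncated_std(limit):
    """Standard deviation of the accepted data when tails beyond limit are cut."""
    phi = math.exp(-0.5 * limit * limit) / math.sqrt(2.0 * math.pi)
    variance = 1.0 - 2.0 * limit * phi / math.erf(limit / math.sqrt(2.0))
    return math.sqrt(variance)


if __name__ == "__main__":
    print("limit  accepted  calls per point  std dev")
    for limit in (2.0, 3.0, 4.0, 5.0, 6.0):
        rate = acceptance_rate(limit)
        # Each attempt uses two generator calls: one candidate and one key.
        calls = 2.0 / rate
        print(f"{limit:5.1f}  {rate:8.4f}  {calls:15.2f}  {truncated_std(limit):.6f}")
```

With a limit of 4, for instance, a little under one candidate in three is accepted, and the standard deviation of the result falls short of one only in the fourth decimal place.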


URL: https://jcullen88.github.io/CullenGenealogyHomepage/Math2.html
© July 12, 2006 - Jim Cullen - all rights reserved.
