Tags: bias, example, excel, expectation, games, generator, istoo, microsoft, msdn, numbers, rand, random, software, statistical

Bias in rand for excel 07

On Microsoft » Microsoft Excel

16,419 words with 9 Comments; publish: Fri, 30 May 2008 20:21:00 GMT; (30662.50, « »)

I'm testing random numbers for games using excel 07 generator, but there is

too much bias away from statistical expectation. For example, when two

integer numbers are generated between 1 and 6, and subsequently added, the

results differ from the easily predictable numbers. Sevens are consistently

in short supply by about 1.5%. That's a lot! Any suggestions for better

randomization in excel would be very much appreciated.

yttrias

All Comments

Leave a comment...

  • 9 Comments
    • Here are my results from excel 2003

      11652

      21715

      31689

      41668

      51661

      61615

      Sub countrand()

      Dim numbers(6)

      For i = 1 To 6

      numbers(i) = 0

      Next i

      For i = 1 To 10000

      myRand = Int(6 * Rnd()) + 1

      numbers(myRand) = numbers(myRand) + 1

      Next i

      For i = 1 To 6

      Range("A" & i) = i

      Range("B" & i) = numbers(i)

      Next i

      End Sub

      "yttrias" wrote:

      > I'm testing random numbers for games using excel 07 generator, but there is

      > too much bias away from statistical expectation. For example, when two

      > integer numbers are generated between 1 and 6, and subsequently added, the

      > results differ from the easily predictable numbers. Sevens are consistently

      > in short supply by about 1.5%. That's a lot! Any suggestions for better

      > randomization in excel would be very much appreciated.

      > --

      > yttrias

      #1; Fri, 30 May 2008 20:22:00 GMT
    • On Dec 16, 2:22 pm, "Bernard Liengme" <blien....excel.todaysummary.com.stfx.TRUENORTH.ca>

      wrote:[vbcol=seagreen]

      > And what was your sample size?

      > --

      > Bernard V Liengme

      > Microsoft Excel MVPwww.stfx.ca/people/bliengme

      > remove caps from email

      > "yttrias" <yttr....excel.todaysummary.com.discussions.microsoft.com> wrote in message

      > news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.todaysummary.com.microsoft.com...

      The flaw in the MS random number generator has been known for some

      time. A description is included in this MS info sheet:

      http://support.microsoft.com/kb/829208

      I use a simulation package that has it's own generator. But there are

      probably others you can download via a web search.

      You may want to run statistical tests on ones that you are considering

      for validation.

      SteveM

      #2; Fri, 30 May 2008 20:23:00 GMT
    • yttrias -

      Which "excel 07 generator" are you using?

      RAND() worksheet function, or

      RND function in VBA, or

      Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ?

      - Mike Middleton

      http://www.DecisionToolworks.com

      Decision Analysis Add-ins for Excel

      "yttrias" <yttrias.excel.todaysummary.com.discussions.microsoft.com> wrote in message

      news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.todaysummary.com.microsoft.com...

      > I'm testing random numbers for games using excel 07 generator, but there

      > is

      > too much bias away from statistical expectation. For example, when two

      > integer numbers are generated between 1 and 6, and subsequently added, the

      > results differ from the easily predictable numbers. Sevens are

      > consistently

      > in short supply by about 1.5%. That's a lot! Any suggestions for better

      > randomization in excel would be very much appreciated.

      > --

      > yttrias

      #3; Fri, 30 May 2008 20:24:00 GMT
    • I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000

      consecutive values at 20 different locations throughout the range of 1M. The

      average of those 20 windows in the 1M was 1587, where the expected average

      should be 1667.

      yttrias

      "Bernard Liengme" wrote:

      > And what was your sample size?

      > --

      > Bernard V Liengme

      > Microsoft Excel MVP

      > www.stfx.ca/people/bliengme

      > remove caps from email

      > "yttrias" <yttrias.excel.todaysummary.com.discussions.microsoft.com> wrote in message

      > news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.todaysummary.com.microsoft.com...

      >

      >

      #4; Fri, 30 May 2008 20:25:00 GMT
    • I don't know about the OP, but Joel's posted code used the VBA Rnd()

      function. Prior to 2007, there were 3 distinct random number generators, the

      worksheet function RAND(), VBA Rnd(), and the ATP random number generator.

      None were very good before 2003, but their problems would arise as high order

      autocorrelations between "random" numbers not as the kind of problems the

      Joel was testing for. It is unlikely to show up in the process that the OP

      seems to describe. Therefore I would want to know how the OP generated

      "integer numbers between 1 and 6" and what the OP was using as expected

      values.

      A decent, but not great algorithm was implemented for worksheet RAND()

      (requires SP1 patch to work properly) and carried into 2007. I have seen no

      evidence that VBA Rnd() has ever been upgraded. RANDBETWEEN() became a

      worksheet function in 2007, and therefore probably switched from using ATP

      random number generator to using the worksheet RAND() function, though I have

      seen no confirmation of this.

      Joel's simulation shows departures from expectation that are NOT

      statistically significant. Generating 10000 sums of 2 values using worksheet

      RAND() and expected values of (6-ABS(7-tot2))/36, I tried several reps in

      2003 with no statistically significant departure from expectation. The

      following VBA code does 10 reps of 10000 using VBA Rnd(), and only one rep

      was statistically different from expectation (again in 2003). Therefore, I

      am somewhat skeptical of the OP's claims until more information is provided.

      Jerry

      Sub tryit()

      Application.ScreenUpdating = False ' to speed execution

      For j = 1 To 10

      For i = 1 To 10000

      x = Fix(Rnd() * 6) + Fix(Rnd() * 6) + 2 ' sum of two random

      integers between 1 and 6

      Cells(x, j) = Cells(x, j) + 1

      Next i

      Next j

      Application.ScreenUpdating = True

      End Sub

      "Mike Middleton" wrote:

      > yttrias -

      > Which "excel 07 generator" are you using?

      > RAND() worksheet function, or

      > RND function in VBA, or

      > Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ?

      > - Mike Middleton

      > http://www.DecisionToolworks.com

      > Decision Analysis Add-ins for Excel

      >

      > "yttrias" <yttrias.excel.todaysummary.com.discussions.microsoft.com> wrote in message

      > news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.todaysummary.com.microsoft.com...

      >

      >

      #5; Fri, 30 May 2008 20:26:00 GMT
    • You are still providing too little information for us to understand what you

      are doing.

      There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()

      function, or the worksheet RAND() function via Evaluate("RAND()").

      As noted in my earlier post, in 2003 the worksheet RAND() function was far

      superior to the VBA Rnd(). I have seen no evidence that either function

      changed between 2003 and 2007.

      Your original post discussed the sum of two (presumably independent) random

      integers between 1 and 6 (such as the total from rolling two fair dice). The

      expected value of a single such sum would be 7. It is unclear how that

      relates to either your observed average of 1587 or expected average of 1667

      for an "average of 20 windows of 40,000 consecutive values.

      It sounds as though your VBA would involve only a few lines of code. If you

      paste that code into your reply, we can see exactly what you did.

      Jerry

      "yttrias" wrote:

      [vbcol=seagreen]

      > I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000

      > consecutive values at 20 different locations throughout the range of 1M. The

      > average of those 20 windows in the 1M was 1587, where the expected average

      > should be 1667.

      > --

      > yttrias

      >

      > "Bernard Liengme" wrote:

      #6; Fri, 30 May 2008 20:27:00 GMT
    • >I ran 1 million samples using RAND in VBA in excel 07. Then I sampled

      >40,000

      > consecutive values at 20 different locations throughout the range of 1M.

      > The

      > average of those 20 windows in the 1M was 1587, where the expected average

      > should be 1667.

      How are you using th Randomize statement? If you are executing it more than

      one time, that could be skewing your results. Consider this posting I have

      posted in the past...

      The Randomize statement should not be executed more than once for the

      lifespan of the object that is executing your code. Doing so more often

      actually makes the data less random than if Randomize is only run once. For

      so few colors, and the probable use the OP wants to use the Rnd function

      for, it will more than likely not matter here, but it is the concept that is

      important to know. I can't demo it within Excel because I am not aware of an

      available object where I can set the color of individual pixels (like a

      PictureBox control in the compiled version of VB for those who have worked

      with that language); however, I do have a VB program that demonstrates this

      fact visually.

      For those of you having access to the compiled versions of VB5 or VB6, here

      is posting I have offered over in the compiled VB newsgroups in the past

      that demonstrates this fact...

      Running Randomize multiple times ends up producing a less random set of

      numbers. To see the problem visually, use this code (which is a modification

      of a routine Bob Butler once posted). Start a new project and put two

      PictureBox'es on your Form (use the default names for everything and

      placement of the PictureBox'es is not important). Paste the following code

      into the Form's code window. The PictureBox on the left results from using

      Randomize only once, the one on the right uses it repeatedly. Both

      PictureBox displays are produced from the same looping code with the only

      difference being the use of the Randomize statement Ignoring the pronounce

      vertical areas (not sure what that is, probably some kind of boundary

      rounding problem), for which one does the distribution of colors look more

      "random"?

      Rick

      Const SCALESIZE = 3

      Private Sub Form_Load()

      Picture1.ScaleMode = 3

      Picture2.ScaleMode = 3

      Randomize

      Picture1.Move 0, 0, _

      128 * Screen.TwipsPerPixelX * SCALESIZE, _

      128 * Screen.TwipsPerPixelY * SCALESIZE

      Picture2.Move Picture1.Width, 0, _

      128 * Screen.TwipsPerPixelX * SCALESIZE, _

      128 * Screen.TwipsPerPixelY * SCALESIZE

      Me.Width = 2.02 * Picture1.Width

      Me.Height = 1.1 * Picture1.Height

      End Sub

      Private Sub Picture1_Paint()

      Dim i As Long

      Dim j As Long

      Dim colr As Long

      Dim bitmask As Long

      For i = 0 To Picture1.ScaleHeight Step SCALESIZE

      For j = 0 To Picture1.ScaleWidth Step SCALESIZE

      colr = Rnd * 16711680

      Picture1.Line (j, i)-Step(SCALESIZE, _

      SCALESIZE), colr, BF

      Next j

      Next i

      For i = 0 To Picture2.ScaleHeight Step SCALESIZE

      For j = 0 To Picture2.ScaleWidth Step SCALESIZE

      Randomize

      colr = Rnd * 16711680

      Picture2.Line (j, i)-Step(SCALESIZE, _

      SCALESIZE), colr, BF

      Next j

      Next i

      End Sub

      #7; Fri, 30 May 2008 20:28:00 GMT
    • "Rick Rothstein (MVP - VB)" wrote in message

      > For those of you having access to the compiled versions of VB5 or VB6,

      here

      > is posting I have offered over in the compiled VB newsgroups in the past

      > that demonstrates this fact...

      > Running Randomize multiple times ends up producing a less random set of

      > numbers. To see the problem visually, use this code (which is a

      modification

      > of a routine Bob Butler once posted). Start a new project and put two

      > PictureBox'es on your Form (use the default names for everything and

      > placement of the PictureBox'es is not important). Paste the following code

      > into the Form's code window. The PictureBox on the left results from using

      > Randomize only once, the one on the right uses it repeatedly. Both

      > PictureBox displays are produced from the same looping code with the only

      > difference being the use of the Randomize statement Ignoring the

      pronounce

      > vertical areas (not sure what that is, probably some kind of boundary

      > rounding problem), for which one does the distribution of colors look more

      > "random"?

      > Rick

      In particular this bit -

      > Ignoring the pronounced vertical areas (not sure what that is,

      > probably some kind of boundary rounding problem)

      Is it some rounding problem or could it be related to the non random nature

      of random, even with the single Randomize.

      Actually I didn't notice those vertical areas you refer to at first in the

      left box, however I amended a couple of values in your code, in particular

      changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the

      sizes by 1 pixel to exaggerate.

      Now I get very distinct not entirely random vertical areas in the left box.

      For the second box I changed Randomize in each loop to Randomize 1, just for

      fun.

      Amended code -

      Const SCALESIZE = 3

      Private Sub Form_Load()

      Const Z As Long = 129 ' original 128

      Me.Left = 10

      Picture1.ScaleMode = 3

      Picture2.ScaleMode = 3

      Randomize

      Picture1.Move 0, 0, _

      Z * Screen.TwipsPerPixelX * SCALESIZE, _

      Z * Screen.TwipsPerPixelY * SCALESIZE

      Picture2.Move Picture1.Width, 0, _

      Z * Screen.TwipsPerPixelX * SCALESIZE, _

      Z * Screen.TwipsPerPixelY * SCALESIZE

      Me.Width = 2.02 * Picture1.Width

      Me.Height = 1.1 * Picture1.Height

      End Sub

      Private Sub Picture1_Paint()

      Dim i As Long

      Dim j As Long

      Dim colr As Long

      Dim bitmask As Long

      Const C As Long = 16777215 ' original 16711680

      For i = 0 To Picture1.ScaleHeight Step SCALESIZE

      For j = 0 To Picture1.ScaleWidth Step SCALESIZE

      colr = Rnd * C

      Picture1.Line (j, i)-Step(SCALESIZE, _

      SCALESIZE), colr, BF

      Next j

      Next i

      For i = 0 To Picture2.ScaleHeight Step SCALESIZE

      For j = 0 To Picture2.ScaleWidth Step SCALESIZE

      Randomize ' original

      Randomize 1

      colr = Rnd * C

      Picture2.Line (j, i)-Step(SCALESIZE, _

      SCALESIZE), colr, BF

      Next j

      Next i

      End Sub

      Regards,

      Peter T

      #8; Fri, 30 May 2008 20:29:00 GMT
    • A similar demo for a VBA Userform -

      Option Explicit

      Private Declare Function FindWindow Lib "user32" _

      Alias "FindWindowA" _

      (ByVal lpClassName As String, _

      ByVal lpWindowName As String) As Long

      Private Declare Function SetPixel Lib "gdi32" _

      (ByVal hdc As Long, ByVal x As Long, _

      ByVal y As Long, ByVal crColor As Long) As Long

      Private Declare Function GetPixel Lib "gdi32" _

      (ByVal hdc As Long, ByVal x As Long, ByVal y As Long) As Long

      Private Declare Function GetDC Lib "user32" ( _

      ByVal hwnd As Long) As Long

      Private Declare Function ReleaseDC Lib "user32" ( _

      ByVal hwnd As Long, ByVal hdc As Long) As Long

      Private Sub PaintPixels()

      Dim hwnd As Long, hdc As Long

      Dim tp As Long, lt As Long

      Dim x As Long, y As Long

      Dim colr As Long

      Const Z As Long = 128 * 2 - 1

      Const C As Long = 16777215

      Me.Left = 10: Me.Top = 10

      Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60

      ' if form is too small, change 0.75 to 1 or 1.25

      hwnd = FindWindow("ThunderDFrame", Me.Caption)

      hdc = GetDC(hwnd)

      tp = Me.Top + 15

      lt = Me.Left + 10

      Randomize

      For y = tp To tp + Z

      For x = lt To lt + Z

      colr = Rnd * C

      SetPixel hdc, x, y, colr

      Next

      Next

      lt = lt + Z + 15

      For y = tp To tp + Z

      For x = lt To lt + Z

      Randomize 1

      colr = Rnd * C

      SetPixel hdc, x, y, colr

      Next

      Next

      ReleaseDC hwnd, hdc

      End Sub

      Private Sub UserForm_Activate()

      Me.Caption = "Click me to (re-) PaintPixels"

      PaintPixels

      End Sub

      Private Sub UserForm_Click()

      ' Me.Repaint

      PaintPixels

      End Sub

      Peter T

      #9; Fri, 30 May 2008 20:30:00 GMT