Randomly Reordering an Array - Beware of Bias
So I came upon the following blog post and found that I (just as the author has done) have been using a naive algorithm for randomly reordering arrays (mostly notably used when shuffling a deck of cards):
I agree I have seen this simple iterate-and-swap reordering algorithm all over the place, even in my education at college. I can think of a few applications that I am using it in right now. Here is some example C# code demonstrating the faulty algorithm:
Apparently, there is some bias introduced by this algorithm while reordering the array. For example, with a 3 item array of values [0, 1, 2], there are 27 outcomes, but only 6 unique permutations. Half of the permutations ([0, 2, 1], [1, 0, 2], and [1, 2, 0]) each appear 5 times in the 27 outcomes, while the other half of the permutations ([0, 1, 2], [2, 0, 1], and [2, 1, 0]) each appear only 4 times each. There is a slight bias toward half of the permutations in this example.
With a few modifications, this algorithm can be morphed into the Fisher-Yates Shuffle (also known as the Knuth Shuffle), which provides for an unbiased way to guarantee each permutation to be equally likely. Here is some example C# code showing the more correct reordering algorithm:
This algorithm has built into it some method for setting aside a growing set of already shuffled items as it iterates through the array. Perhaps it is this feature that eliminates the bias toward a certain subset of the permutations (I can’t really tell from my readings).
FOLLOWUP (01/05/09): I found this article today on the same subject, but with a lot more graphs and detailed explanations: The Danger of Naïveté