Skip to main content

4.1. Using Stream Parity to De-Skew

As a simple but not particularly practical example, consider taking a sufficiently long string of bits and mapping the string to "zero" or "one". The mapping will not yield a perfectly uniform distribution, but it can be as close as desired. One mapping that serves the purpose is to take the parity of the string. This has the advantages that it is robust across all degrees of skew up to the estimated maximum skew and that it is trivial to implement in hardware.

The following analysis gives the number of bits that must be sampled:

Suppose that the ratio of ones to zeros is ( 0.5 + E ) to ( 0.5 - E ), where E is between 0 and 0.5 and is a measure of the "eccentricity" of the distribution. Consider the distribution of the parity function of N bit samples. The respective probabilities that the parity will be one or zero will be the sum of the odd or even terms in the binomial expansion of (p + q)^N, where p = 0.5 + E, the probability of a one, and q = 0.5 - E, the probability of a zero.

These sums can be computed easily as

                    N            N
1/2 * ( ( p + q ) + ( p - q ) )

and

                    N            N
1/2 * ( ( p + q ) - ( p - q ) )

(Which formula corresponds to the probability that the parity will be 1 depends on whether N is odd or even.)

Since p + q = 1 and p - q = 2E, these expressions reduce to

                  N
1/2 * [1 + (2E) ]

and

                  N
1/2 * [1 - (2E) ]

Neither of these will ever be exactly 0.5 unless E is zero, but we can bring them arbitrarily close to 0.5. If we want the probabilities to be within some delta d of 0.5, e.g., then

                       N
( 0.5 + ( 0.5 * (2E) ) ) < 0.5 + d

Solving for N yields N > log(2d)/log(2E). (Note that 2E is less than 1, so its log is negative. Division by a negative number reverses the sense of an inequality.)

The following table gives the length N of the string that must be sampled for various degrees of skew in order to come within 0.001 of a 50/50 distribution.

Prob(1)EN
0.50.001
0.60.104
0.70.207
0.80.3013
0.90.4028
0.950.4559
0.990.49308

The last entry shows that even if the distribution is skewed 99% in favor of ones, the parity of a string of 308 samples will be within 0.001 of a 50/50 distribution. But, as we shall see in section 5.2, there are much stronger techniques that extract more of the available entropy.