In a perfect world, as far as mathematical modeling and easy closed-form manipulation are concerned, we'd be thrilled if every random variable we ever dealt with had the Gaussian (a.k.a. Normal) distribution. The most important of its many desirable properties is that the average of a large number of independent draws of any finite-variance random variable is approximately Gaussian, by the Central Limit Theorem.
Poker results, on a per-hand or per-tournament basis, of course do not have the Gaussian distribution. For one, the probability distribution of poker results is discrete rather than continuous. Beyond that, we would generally expect much more weight on the extreme outcomes in a poker result distribution than in the corresponding Gaussian distribution of the same mean and variance.
So, how "close" to Gaussian are the distributions of poker results for different games? If the distribution of the results of a single hand of a certain game of poker is not close to Gaussian, how many hands must be played before the average becomes close to Gaussian via the Central Limit Theorem? And what are the practical implications for bankroll management?
I ran some simulations on some old data from my own play, assuming that the approximate probability distribution of per-hand results in each game was simply the empirical distribution based on my historical data. There are a number of problems with this that will keep this from being anywhere near a perfect assumption, but it's the best we can do. With large enough sample sizes in (hypothetical) constant game conditions, it would be fine.
$1/2 CAP NL play is 6-handed NL Holdem with 30bb (30 big blind) stacks. The betting cap ensures that no hands are ever played deeper than 30bb, though some are played shallower against shortstacked opponents.
$1/2 RUSH NL is exclusively 9-handed NL Holdem at Rush tables, with 100bb starting stacks (auto-reloading to 100bb every hand) and frequent deeper stacks. In the current online poker marketplace, the Rush games seem to be the best opportunity for gathering data on deeper-stacked play due to their high liquidity and speed of play.
$1/$2 PLO is 6-handed Pot-Limit Omaha, mostly on shallow or cap tables, with stacks of around 40bb.
Visualizations of Normality
For the sake of developing an intuition for "how long the long run is" for achieving approximate normality, I looked at histograms of the empirical probability distribution of poker hand outcomes, plotted against the Gaussian distribution of matching parameters (the thin blue curve).
I plotted the behavior of these distributions after 1 hand (top-left), 10 hands (top-right), 1,000 hands (bottom-left), and 100,000 hands (bottom-right). The x-axis is in big blinds, rather than dollars.
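The experiment behind plots like these can be sketched roughly as follows. This is a minimal illustration, not the code used for the figures: the per-hand outcomes and weights are invented stand-ins for an empirical distribution, and the function names are hypothetical. It resamples n-hand totals and measures the largest gap between their empirical CDF and the CDF of a Gaussian with matching mean and standard deviation.

```python
import random
from statistics import NormalDist, fmean, pstdev

# Hypothetical per-hand results in big blinds (invented for illustration,
# not real data): mostly small pots, with rare 30bb swings.
OUTCOMES = [-30.0, -5.0, -1.0, 0.0, 2.0, 7.0, 30.0]
WEIGHTS = [0.02, 0.10, 0.30, 0.33, 0.15, 0.08, 0.02]


def sample_totals(n_hands, n_paths, rng):
    """Total result of n_hands resampled hands, repeated n_paths times."""
    return [
        sum(rng.choices(OUTCOMES, weights=WEIGHTS, k=n_hands))
        for _ in range(n_paths)
    ]


def ks_distance(totals):
    """Largest gap between the empirical CDF and the matching Gaussian CDF."""
    gauss = NormalDist(fmean(totals), pstdev(totals))
    xs = sorted(totals)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        c = gauss.cdf(x)
        # Compare against the empirical CDF just below and at each point.
        gap = max(gap, abs(c - i / n), abs(c - (i + 1) / n))
    return gap


if __name__ == "__main__":
    rng = random.Random(0)
    for n_hands in (1, 10, 100):
        totals = sample_totals(n_hands, 1000, rng)
        print(n_hands, round(ks_distance(totals), 3))
```

The gap should shrink as the number of hands grows, until it is dominated by sampling noise from the finite number of simulated paths.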
For the $1/2 CAP data:
After 1,000 hands, I was surprised to see just how close to Gaussian the distribution already was. There isn't even that much visual improvement going up to 100,000 hands. The shortstacked nature of the cap games seems to produce fast convergence to approximate normality.
For the deeper-stacked $1/2 RUSH data:
For the $1/$2 PLO data:
As a final point of comparison, I thought it would be interesting to look at heads-up tournaments, where the mass of the probability distribution would be only on the two extreme possibilities of -1 and roughly +1 buyin (adjusting for rake). I used some fictitious data based on the rake structure of $55+$2.50 HU SNGs and on a winrate of 55%, with the buyin size normalized to 100:
So, as we've seen here, and as the Central Limit Theorem guarantees, once we play enough hands or tournaments, our results will be very close to Gaussian. Consistency of yearly results might be the #2 concern for profit-minded players, and as far as this goes, it looks like Gaussian approximations should be great here for cash game players putting in any amount of volume. Live players, however, will be suffering from both fewer hands/year and higher variance from deeper stacks... my guess would be that a live player would want to be putting in at least 500 hours/year to be able to assume approximate normality of annual results.
However, the #1 concern for a poker player is the ability to sustain one's bankroll. It turns out that a popular and effective bankroll management formula is derived from an assumption of perfect normality — is this assumption a good fit?
Accuracy of Gaussian ruin probabilities
While we've seen that year-end results should be quite close to Gaussian, a poker player who goes broke in the middle of the year due to some perhaps non-Gaussian sudden downswings is not going to be able to reach the end of the year to achieve his nice, nearly-Gaussian result. So, in terms of the probabilities that the path of one's bankroll would cross a certain lower bound (usually zero), does this data behave similarly enough to that of perfectly Gaussian data?
Note that the true Gaussian distribution is unbounded, so a very small percentage of paths will have quick movements of very large magnitude. Actual poker distributions are bounded, so in this way, we would expect the Gaussian paths to fall to zero more often. On the other hand, actual poker distributions have higher kurtosis (a.k.a. fatter tails, that is, more likelihood is put on extreme results than in the Gaussian distribution), which would counteract this effect. Which of these effects will dominate?
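To make "higher kurtosis" concrete: kurtosis is the fourth central moment divided by the squared variance, and it equals 3 for any Gaussian, so "excess kurtosis" is that ratio minus 3. The sketch below computes it for a discrete distribution. The per-hand outcomes and weights are invented for illustration and are not the data used in this article.

```python
def excess_kurtosis(outcomes, probs):
    """Excess kurtosis of a discrete distribution (0 for a Gaussian)."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    fourth = sum(p * (x - mean) ** 4 for x, p in zip(outcomes, probs))
    return fourth / var ** 2 - 3.0


def excess_kurtosis_of_sum(single_hand_excess, n_hands):
    """For independent, identically distributed hands, excess kurtosis scales as 1/n."""
    return single_hand_excess / n_hands


if __name__ == "__main__":
    # Hypothetical per-hand results in big blinds (illustration only).
    outcomes = [-30.0, -5.0, -1.0, 0.0, 2.0, 7.0, 30.0]
    probs = [0.02, 0.10, 0.30, 0.33, 0.15, 0.08, 0.02]
    one_hand = excess_kurtosis(outcomes, probs)
    print(one_hand, excess_kurtosis_of_sum(one_hand, 1000))
```

A symmetric two-point distribution (an even-money coin flip) sits at the opposite end, with an excess kurtosis of exactly -2.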
If poker hands were perfectly Gaussian, then we could very closely approximate our (discrete) bankroll path by the (continuous) stochastic process of Brownian Motion, which is essentially a continuous extension of Gaussian random variables. In this case, the ruin probability (the probability of ever hitting 0 from a given starting bankroll, for a results process with a given mean and variance) would follow this formula, widely known from Chen and Ankenman's The Mathematics of Poker but also an easily derived property of Brownian Motion:
\[ P(\text{ruin}) = e^{-2 \mu B / \sigma^2} \]
where B is the (starting) bankroll, μ is the mean of the poker results process, and σ is its standard deviation.
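As a small sketch, taking the formula to be the standard Brownian-motion result exp(-2μB/σ²), the calculation and its inverse (the bankroll needed for a target risk of ruin) look like this. The function names and the example winrate are hypothetical. B, μ and σ must be in consistent units, for example big blinds and big blinds per hand.

```python
import math


def gaussian_ruin_probability(bankroll, mean, sd):
    """Probability of ever hitting 0 under the Brownian-motion approximation."""
    if mean <= 0:
        return 1.0  # a non-winning player is eventually ruined
    return math.exp(-2.0 * mean * bankroll / sd ** 2)


def required_bankroll(target_ruin, mean, sd):
    """Bankroll at which the Gaussian ruin probability equals target_ruin."""
    if mean <= 0:
        raise ValueError("no finite bankroll suffices without a positive winrate")
    return -(sd ** 2) * math.log(target_ruin) / (2.0 * mean)


if __name__ == "__main__":
    # Hypothetical winner: 5bb/100 with a standard deviation of 80bb/100,
    # i.e. 0.05bb and 8bb per hand.
    print(gaussian_ruin_probability(2000, 0.05, 8.0))
    print(required_bankroll(0.05, 0.05, 8.0))
```

Because the exponent depends only on μB/σ², doubling the winrate has the same effect on risk of ruin as doubling the bankroll.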
Of course, we don't have a nice formula for the actual ruin probabilities given that our poker results are not perfectly Gaussian, but we can run Monte Carlo simulations to approximate these ruin probabilities for different starting bankrolls and compare them to the exact result for the Gaussian approximation.
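A minimal sketch of such a Monte Carlo comparison is shown here for a two-outcome heads-up tournament. It is not the simulation used for the tables in this article. The payoffs assume that the total cost of a $55+$2.50 SNG is what gets rescaled to 100, which is only one possible reading of the normalization. Since "ever hitting 0" cannot be simulated directly, each path stops at ruin, at a hypothetical `safe_level` from which ruin is very unlikely, or after `max_steps` tournaments, so the estimate slightly understates the true ruin probability.

```python
import math
import random

# Assumed payoffs: a win nets 52.50/57.50 of the rescaled buyin of 100,
# and a loss costs the full 100. Winrate of 55%.
OUTCOMES = [100.0 * 52.5 / 57.5, -100.0]
WEIGHTS = [0.55, 0.45]


def gaussian_ruin(bankroll, outcomes, weights):
    """Gaussian-formula ruin probability using the distribution's mean and variance."""
    mean = sum(w * x for x, w in zip(outcomes, weights))
    var = sum(w * (x - mean) ** 2 for x, w in zip(outcomes, weights))
    return 1.0 if mean <= 0 else math.exp(-2.0 * mean * bankroll / var)


def simulate_ruin(bankroll, outcomes, weights, n_paths, safe_level, max_steps, rng):
    """Fraction of simulated bankroll paths that fall to zero or below."""
    ruined = 0
    for _ in range(n_paths):
        b = bankroll
        for _ in range(max_steps):
            b += rng.choices(outcomes, weights=weights)[0]
            if b <= 0:
                ruined += 1
                break
            if b >= safe_level:
                break  # treated as safe; this truncation is an approximation
    return ruined / n_paths


if __name__ == "__main__":
    rng = random.Random(0)
    start = 1000.0  # ten buyins
    print("simulated:", simulate_ruin(start, OUTCOMES, WEIGHTS, 1000, 6000.0, 20000, rng))
    print("gaussian: ", gaussian_ruin(start, OUTCOMES, WEIGHTS))
```

With only 1,000 paths the simulated figure carries a standard error of over a percentage point, so small differences from the Gaussian value need many more paths to resolve.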
The left column is starting bankroll, in big blinds (for the tournament, a value of 100 represents one tournament buyin).
For the $1/2 CAP data:
For the $1/2 RUSH data:
For the $1/$2 PLO data:
For the fictitious HU tournament distribution:
We first notice that the risk of ruin for the $1/$2 PLO data set is 1 in all cases, as will always be the case when one's winrate is negative... your author is still working on his PLO game and has a very limited sample size so far. Whoops.
We then notice that, across the board, the Gaussian approximation to the ruin probability is higher than the simulated ruin probability with the empirical distribution. The difference appears to be decreasing in bankroll size, as we would expect from the Central Limit Theorem. The first few lines are for excessively small bankrolls, so they are not of any particular practical interest. The difference appears to be largest for the tournaments, as we would expect.
It looks like the boundedness of the true distribution is a bigger effect than the higher kurtosis, so the result is that actual ruin probabilities are lower than the easy formula suggests, which is great! Moreover, since the errors are small (especially for reasonable bankroll sizes which yield reasonably low practical risks of ruin), at least for cash games, we'll be a little bit extra-conservative by simply using the easy Gaussian formula.
Conclusions and Implications
Basic cash game results seem to be close enough to Gaussian for both terminal results and probabilities of hitting sufficiently far-away lower bankroll bounds along the way. Therefore we can rely on the easy Gaussian ruin probability formula for modeling purposes, and we will approximate long-term results (such as when evaluating expected utility over one year) with Gaussian distributions, at least outside of tournament play.
We should be careful about drawing definitive broad conclusions from these simulations. We have treated only a few different poker games, at only one moment in time for the poker economy, and only for one player's particular strategy. We should expect that the results may be different for poker games with deeper stacks, or with looser players. Games with higher variance or games with less continuous one-hand result distributions (such as limit games) should be further from Gaussian.