Quantitative Poker: Levels of Randomness: Beyond G-Bucks

Let's start with a simple example poker hand.

$1/$2 heads-up cash game with no rake. For some reason, our opponent is down to $9 before the start of the hand. We are in the small blind. We are dealt the J♦2♥ and move all-in. Our opponent holds the Q♣Q♦ and calls. The board runs out J♥8♥2♠Q♥A♥. We drag the $18 pot.

Did either player "get lucky" in this hand?

Some would say that we got lucky to beat pocket Queens with a garbage hand. Others might say that the Queens got lucky to outdraw us on the turn, or maybe we got lucky to redraw them on the river. Still others might say that our opponent was lucky to wake up with a hand as strong as Queens when he got shoved on for 4.5 big blinds.

More to the point, are any of these perspectives on "luck", or any of the different levels of randomness in a hand, useful and deserving of a strategic player's focus? If so, which is most important?

Most approaches use expectations conditioned on some chosen level of early-street random events and player-controlled moves to average out the "luck of the draw" that remains after this conditioning. This is most easily seen through the first few levels, which are well established and quickly understood by most students of poker. These perspectives are useful, but there's nothing stopping us from going to higher levels when the situation allows for it.

Levels of Randomness

Level	Title	Averages out over
0	Results	Nothing
1	Sklansky Bucks	Cards dealt after all players are all-in
1.5	Galfond Bucks	… and one player's range
2	Range vs. Range	… as well as the other player's range
3	Strategy vs. Strategy	All cards dealt and lines taken
4	Strategy vs. Distribution	… and our Bayesian inference of our opponents' strategies

Level 0: Results

If we condition on everything (in our example, all of the cards, from the holecards to the river), then we're just looking at the results of the hand. A beginning player operating on this level of understanding would conclude that we played the example hand correctly because we went on to win it and achieved the best possible result (+$9). He would have lost sleep trying to figure out what he did wrong if he had instead lost the hand.

Such novices have always been deservedly and universally mocked by all levels of barely-competent poker players. We don't need to spend much time here... we're all better than Level 0.

Level 0.5: Any nonsense in between

Any perspective in between Level 0 and Level 1, such as one that notices that the final outdraw occurred on the river, is pointless. There is no information to be gleaned from the order in which cards happen to fall after all players are all-in, at least none that has any practical or outcome-based use. Do your emotional, fallacy-driven human brain a favor and just minimize the table once you've gotten all-in. You'll see if you won or lost when the table pops back up, and sweating the details is a distraction at best and an impediment to your performance at worst.

Level 1: Hand vs. Hand, a.k.a. All-in Adjusted EV, a.k.a. Sklansky Bucks

David Sklansky's Fundamental Theorem of Poker, introduced in his book The Theory of Poker, states that players need not worry about what cards happen to fall after the relevant action in the hand has concluded; as long as we play our hand in a way that turns out to be the same way we would have played it if we knew our opponent's cards, we win. If we can make them play their particular hand in a way that they would not have if they could have seen our cards, we also win. The all-in luck averages out in the long run.

This level conditions on all cards that have been dealt prior to all players becoming all-in and averages over only the remaining cards to be dealt out. For our particular example hand, our all-in equity with J♦2♥ against Q♣Q♦ is 11.27%. Therefore we received a Level 1 outcome of -$6.97, worse than that of just taking the -$1 of folding our small blind. We lost "Sklansky Bucks" in this hand, and we made a "Fundamental Theorem of Poker mistake" in our example hand because had we seen our opponent's cards, we would have folded preflop.
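The arithmetic behind that -$6.97 can be sketched in a few lines of Python. This is only an illustration: the function name `all_in_ev` and its parameters are made up here, and the 11.27% equity is taken as given rather than computed from the cards.

```python
def all_in_ev(equity, pot, invested):
    """Expected net result of an all-in: our share of the pot minus what we put in."""
    return equity * pot - invested

# Figures from the example hand: $18 pot, $9 of it ours, 11.27% equity.
level_1 = all_in_ev(equity=0.1127, pot=18.0, invested=9.0)
print(f"Level 1 outcome: {level_1:+.2f}")   # prints "Level 1 outcome: -6.97"
```

Compare that with the -$1 we would have "earned" by folding the small blind.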

To term this a "mistake" in any practical sense is absurd; shoving J♦2♥ for 4.5 big blinds is part of the optimal strategy, assuming that any move other than all-in or fold in this subgame would be suboptimal (see The Mathematics of Poker or any table of short-stacked heads-up NL Nash equilibria).

So, while the Fundamental Theorem of Poker is a useful guideline for lifting the thinking of beginners out of the hell of Level 0 and for illustrating that all-in adjusted EV is an unbiased estimator of winnings, it has few practical applications in most poker situations. Instead, we must go deeper...

Level 1.5: Range vs. Hand (Galfond Bucks)

Phil Galfond expanded Sklansky's idea into a much more practically useful measure in his 2007 G-Bucks article. Galfond, in pioneering the idea of rangewise thinking, explained that since, from our perspective, our opponent has a probability distribution of hands in any given spot, we can make assumptions about his hand range and average out the results over those hands as well as the remaining board cards. It applies more often than any Level 1 measure, since it also covers river calls in hands where players did NOT get all-in.

However, since Level 1.5 randomness requires assumptions to be made about our opponent's ranges (or our own, when looking at a decision from the opponent's perspective), it's not something that can be easily coded and evaluated as a variance reduction measure over a database of hands. Computational simplicity ended in Level 1.

Going back to our example hand, if both we and our opponent are playing the Nash equilibrium for 4.5 big blind heads-up push-or-fold poker (a reasonable assumption), then he will be calling our shove with

[22+,A2s+,K2s+,Q2s+,J2s+,T2s+,95s+,85s+,75s+,65s,54s,A2o+,K2o+,Q2o+,J3o+,T6o+,97o+,87o]

which is about 67.7% of hands, ignoring card removal effects. Our equity with J♦2♥ against this range is 36.83%, so we received a Level 1.5 outcome of -$2.37, worse than that of just taking the -$1 of folding our small blind. So, since our opponent happened to have a calling hand, we lost "Galfond Bucks" in this hand as well as "Sklansky Bucks".
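The 67.7% figure can be checked by counting starting-hand combinations. The sketch below is one hypothetical way to do it. The helper `count_combos` and its tiny parser are illustrative only and understand just the shorthand used in this post (pairs like 22+, suited or offsuit hands like J3o+ or 65s). It assumes the tokens in a range don't overlap, which holds for the calling range above, and like the text it ignores card removal.

```python
RANKS = "23456789TJQKA"
TOTAL_COMBOS = 1326  # 52 choose 2 two-card starting hands

def count_combos(range_str):
    """Count hole-card combos in a range written like '22+,A2s+,J3o+,65s'.

    Assumes tokens do not overlap (there is no double-counting protection).
    """
    total = 0
    for token in range_str.split(","):
        plus = token.endswith("+")
        hand = token.rstrip("+")
        high, low = RANKS.index(hand[0]), RANKS.index(hand[1])
        if high == low:
            # Pocket pair: 6 combos each; '+' runs up through aces.
            total += 6 * ((len(RANKS) - high) if plus else 1)
        else:
            # '+' raises the kicker up to one rank below the high card.
            hands = (high - low) if plus else 1
            total += hands * (4 if hand[2] == "s" else 12)
    return total

CALLING_RANGE = ("22+,A2s+,K2s+,Q2s+,J2s+,T2s+,95s+,85s+,75s+,65s,54s,"
                 "A2o+,K2o+,Q2o+,J3o+,T6o+,97o+,87o")

combos = count_combos(CALLING_RANGE)
print(combos, f"{combos / TOTAL_COMBOS:.1%}")   # prints "898 67.7%"
```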

Note that this level of randomness ignores the fact that our opponent will be folding to our shove 32.3% of the time, which is enough to make our shove profitable and part of the optimal strategy.
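To see how the folds rescue the shove, we can weight the two branches by how often they happen. This is a rough sketch that reuses the numbers above and, like them, ignores card removal. The variable names are just for illustration.

```python
fold_freq = 0.323            # opponent folds his big blind
call_freq = 0.677            # opponent calls the shove
equity_when_called = 0.3683  # J2o against the calling range

ev_when_called = equity_when_called * 18.0 - 9.0   # the Level 1.5 outcome, about -2.37
ev_when_folded = 2.0                               # we pick up his $2 big blind

ev_shove = fold_freq * ev_when_folded + call_freq * ev_when_called
ev_fold = -1.0                                     # we surrender our $1 small blind

print(f"shove: {ev_shove:+.2f}, fold: {ev_fold:+.2f}")   # prints "shove: -0.96, fold: -1.00"
```

Under these numbers the shove gives up about $0.96 on average, a bit less than the $1 we give up by folding.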

While Level 1.5 randomness is as far as one might be able to go for solving complicated poker decisions within a given hand, we can go further for situations as simple as our 4.5 big blind heads-up push-or-fold spot.

Level 2: Range vs. Range

Galfond's measure exists with the aim of guiding practical decisions for a given player's specific cards. While this is sufficient for making in-hand decisions, if we want to produce a more general measure of randomness, we can expand to looking at the ranges of both players, rather than just one. Moreover, thinking about our ranges away from the table, rather than approaching each new situation on a hand-by-hand basis as we go, helps us balance our overall strategies.

In particular, we know each player's entire optimal range for when 4.5 big blinds are shoved and called heads-up, so we might as well condition on those ranges and average out over even the players' particular hands from those ranges.

When we know all players' ranges in a given spot, the particular cards that they end up holding are as irrelevant as the particular card that comes on the river when they were already all-in.

To put it another way, I don't really care that you happened to wake up with Queens.

The optimal 4.5 big blind shoving range is

[22+,A2s+,K2s+,Q2s+,J2s+,T2s+,93s+,84s+,74s+,64s+,53s+,A2o+,K2o+,Q2o+,J2o+,T6o+,96o+,86o+,76o]

which has 49.03% equity against the noted calling range our opponent will employ. Our Level 2 outcome of -$0.17 captures the fact that, when we get all-in with these stack sizes, we could each have any of the hands in our ranges.

At least for situations as simple and solvable as this one, we're achieving additional variance reduction and thus better accuracy by taking a Level 2 measure of outcomes. This approach is also consistent with proper strategic rangewise thinking in this spot — no need to invest any emotional or rational energy into the fact that we got called when we had one of the worst hands in our range, as we knew it was correct within our overall strategy to include that hand.

Level 3: Strategy vs. Strategy

All the levels so far have implicitly conditioned on the line of action that occurred in the hand, that is, they only "work" after particular player moves were made which led to an all-in or a river call.

In our example hand, all previous levels have ignored the probability that our opponent folds to our shove. Since we know the exact ranges that the players get all-in with, we also know the complete strategies that each player employs: get all-in with the hands in that range, fold the others. Here, a "strategy" for a particular poker situation is a massive list of probability distributions of ALL possible moves a player will make in EVERY possible poker situation over ALL streets. 4.5 big blind heads-up push-or-fold poker is one of the few games simple enough that a complete strategy can fit onto one sheet of paper.

The range we are shoving is about 73.2% of all starting hands. So the Level 3 outcome of $0.12 (= .732 * [.677 * -$0.17 + .323 * $2] + .268 * -$1) for the subgame where we are on the button with 4.5 big blinds is what actually captures the value of being dealt a hand of poker in that spot when players are employing optimal strategies. At this level, we could verify optimality of each range by verifying that any other range produces a lesser result.
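The Level 3 calculation is just a small probability tree, and it can be written out as a sketch. The constant and variable names are illustrative, the frequencies are the rounded ones quoted in the text, and the two ranges are treated as independent (no card removal). The code carries the unrounded Level 2 figure forward, so intermediate values differ slightly from the hand calculation but round to the same result.

```python
POT, STACK = 18.0, 9.0
SMALL_BLIND, BIG_BLIND = 1.0, 2.0

shove_freq = 0.732      # share of hands we shove
call_freq = 0.677       # share of hands he calls with
range_equity = 0.4903   # our shoving range vs. his calling range

level_2 = range_equity * POT - STACK   # outcome when we shove and get called
ev_given_shove = call_freq * level_2 + (1 - call_freq) * BIG_BLIND
level_3 = shove_freq * ev_given_shove + (1 - shove_freq) * -SMALL_BLIND

print(f"Level 2: {level_2:+.2f}  Level 3: {level_3:+.2f}")
# prints "Level 2: -0.17  Level 3: +0.12"
```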

When we know all players' strategies and thus know how often each particular line of action gets taken, even the particular lines and actions taken are as irrelevant as the cards to come when all-in.

More generally, this level of randomness lets us "work backwards" from all-in or river situations where Level 1.5 randomness applies. For example, we may have reached a river decision in a hand where we made a call that was Level 1.5 correct (positive in "Galfond Bucks" expectation), but the strategies on prior streets that led us to that river spot with the ranges we had may or may not have been winning strategies. Expanding our focus to strategies rather than river calls or all-in moves is the only way to evaluate early street play.

Level 4: Strategy vs. Inferred Strategy Distribution

When we leave the realm of simplified poker games with easy Nash equilibria and enter a world where our opponents make mistakes, we may not be able to assume that our opponent follows a single exact strategy, especially when we are not very familiar with the player. Instead, we can assume that our opponent will be employing a particular strategy from a range of strategies. Just as we don't put our opponents on one specific hand in poker situations, we should allow for the fact that their strategies will be either dynamic or impossible to completely infer.

We could expand our example hand by adding the possibility that our opponent may not be playing optimally. For example, given our assessment of his play so far, we could estimate that there's a 50% chance that he is a smart player employing the Nash equilibrium strategy and a 50% chance that he's playing some suboptimal strategy that includes perhaps folding in the BB too often. Then, under the assumption of these particular probabilities, the profitability of every move we make will be the average of its profitability against the Nash equilibrium and against the suboptimal strategy. Maximizing our Level 4 payoff will involve determining the optimal exploitive strategy based on our Bayesian inference about our opponent's strategy — and the range of strategies we put him on will be updated constantly as we learn more about our opponent and/or he modifies his strategy. Of course, all poker players do this, informally. It's the basis of any strategic decision at any part of a poker hand.
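A toy version of this averaging and updating might look like the following. Everything about the "tight" opponent is invented for illustration: his 45% fold frequency and our 33% equity when he calls are hypothetical numbers, not solved values. The sketch also evaluates only one action (shoving J2o) instead of searching for the best exploitive strategy, and the update assumes the only thing we observe is a single fold.

```python
def shove_ev(fold_freq, equity_when_called, pot=18.0, stack=9.0, big_blind=2.0):
    """EV of shoving one hand against an opponent who folds `fold_freq` of the time."""
    ev_called = equity_when_called * pot - stack
    return fold_freq * big_blind + (1 - fold_freq) * ev_called

# Two candidate opponent strategies for our J2o shove.
# The Nash numbers come from the post; the "tight" numbers are hypothetical.
strategies = {
    "nash":  {"fold_freq": 0.323, "equity_when_called": 0.3683},
    "tight": {"fold_freq": 0.450, "equity_when_called": 0.3300},  # invented
}
prior = {"nash": 0.5, "tight": 0.5}

def mixture_ev(belief):
    """Level 4 payoff: average the EV against each strategy, weighted by belief."""
    return sum(p * shove_ev(**strategies[name]) for name, p in belief.items())

def update_on_fold(belief):
    """Bayes' rule after seeing him fold to one shove."""
    joint = {name: p * strategies[name]["fold_freq"] for name, p in belief.items()}
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

posterior = update_on_fold(prior)
print(f"EV under prior:     {mixture_ev(prior):+.3f}")
print(f"P(nash) after fold: {posterior['nash']:.3f}")
print(f"EV under posterior: {mixture_ev(posterior):+.3f}")
```

Each observed fold shifts a little weight toward the over-folding profile, which in turn nudges up the value of shoving marginal hands.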

We won't bother trying to calculate this one for the example hand here. In fact, once we make more complicated assumptions on a wider range of different opponent strategy profiles, we are well beyond the realm of computational tractability by the time we reach this level. Nonetheless, if we were the theoretical infinitely-knowledgeable and rational player, this would be what we would do in practice.

Recall that there are "Level 1/Fundamental Theorem of Poker mistakes" that are clearly not play mistakes, and that Level 1.5 randomness expanded upon Level 1 randomness in a way that eliminated many of the most obvious examples of such "false mistakes". Level 4 randomness expands upon Level 3 randomness in the same way: a well-reasoned strategy choice that happens to be thwarted by an unexpected strategy from the opponent was not necessarily a real mistake, just as failing to fold our J♦2♥ when our opponent held Q♣Q♦ was not a real mistake. We can't know our opponent's exact strategy, just as we can't know his exact holecards. We can only make strategically-optimal inferences about each.

Conclusions

Note that each level of randomness is an unbiased estimator of the same (Level 0) true results. That is, just as all-in adjusted EV (Level 1) is an unbiased estimator of results, so are the higher levels of randomness. In a shove-or-fold situation with optimal strategies, the Level 3 result is a constant number which is the exact expected value of the per-hand results. In particular, when applying variance reduction techniques to play in a specific situation such as 4.5 big blind heads-up push-or-fold poker when both you and your opponent are playing the Nash equilibrium, you will get a more accurate measure of the true expected value of your play by computing the Level 3 results.
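A quick simulation can make the "unbiased estimator" point concrete for Levels 0 and 1. This sketch replays our J♦2♥ vs. Q♣Q♦ all-in many times, treating the 11.27% equity as a pure win probability (an assumption: chopped pots are ignored). The seed and sample size are arbitrary.

```python
import random
import statistics

rng = random.Random(2011)   # fixed seed so the run is repeatable
EQUITY, POT, STACK = 0.1127, 18.0, 9.0
N_HANDS = 100_000

# Level 0: what actually happens each time (win the pot or lose the stack).
level_0 = [POT - STACK if rng.random() < EQUITY else -STACK for _ in range(N_HANDS)]
# Level 1: the all-in adjusted EV, identical every time the hand is replayed.
level_1 = EQUITY * POT - STACK

mean_0 = statistics.fmean(level_0)
stdev_0 = statistics.fmean([(x - mean_0) ** 2 for x in level_0]) ** 0.5

print(f"Level 0: mean {mean_0:+.2f}, st. dev. {stdev_0:.2f}")
print(f"Level 1: mean {level_1:+.2f}, st. dev. 0.00")
```

Both measures should centre on roughly the same value, but the Level 0 results swing by several dollars per hand while the Level 1 figure does not move at all. That gap is the variance reduction.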

From each level of randomness, we can look at the type of "luck" associated with it. In our example hand, we were:

Level 1 lucky, as we managed to win an all-in where we were an underdog,
Level 1.5 unlucky, as we ran into a hand near the top of our opponent's range,
Level 2 unlucky, as not only did we run into the top of our opponent's range, we did so with the bottom of our range,
Level 3 and Level 4 neutral, as we assumed our opponent was playing the Nash equilibrium with probability 1, and we ourselves played the Nash equilibrium.

Since each successive level adds extra information from the previous level into what it averages out over, the types of "luck" are additive.

It's worth noting that, in any all-in situation, whatever the outcome is, at least one player will have been Level 1 lucky. In this hand, it was us.

There is some elegance in the fact that Level 1 luck, the most salient and blatant form of luck, is the one that players have the least control over. Conversely, Level 4 luck is under full control of the players as it is entirely dependent on the players' relative skills at any given moment in time.

So maybe we did "get lucky" to win in that particular hand. The Level 1 luck is, in some sense, the foundation level of any hand involving an all-in. Does that make it the most important type of luck? In some ways, yes, but in many ways, no.

Each level of luck seems to offer its own lesson about avoiding various degrees of results-oriented thinking:

Level 1 randomness tells us not to second-guess ourselves just because we happen to lose an all-in confrontation.
Level 1.5 randomness tells us not to second-guess ourselves just because we happened to run into the top of our opponent's range.
Level 2 randomness tells us not to second-guess ourselves just because we happened to run our well-crafted, profitable range into a perhaps suboptimal range from our opponent, e.g. if our opponent chose to bluff-catch too light with a range that matched up poorly against our value betting and bluffing ranges. (The differences between Level 1.5 and Level 2 are small, but they should exist in spots where we employ mixed strategies.)
Level 3 randomness tells us not to second-guess ourselves just because we happened to choose an overall strategy which matched up poorly against our opponent's particular strategy, as long as we made our strategy choice in an informed way.
and, finally...

Level 4 randomness tells us that we should only second-guess ourselves when we chose an overall strategy which was suboptimal relative to the distribution of strategies we could optimally infer our opponent to have.

Level 1 luck is the type of luck that most players have developed the ability to ignore by training themselves not to be results-oriented.

Then, once we're ignoring Level 1 luck, we can start to ignore Level 2 luck and be confident in our overall strategies, even when we run the bottom of our range into the top of our opponent's range.

Once we've achieved indifference to Level 2 luck, we're on the final stretch towards conditioning ourselves to ignore Level 3 and Level 4 luck, and then we've reached nirvana. Complete emotional and strategic neutrality to all instrumental randomness in the game. Complete focus only on making optimal inferences about our opponent's play and executing optimal exploitive strategies. Nobody got lucky in the example hand.

A lofty goal, to be sure, but worthy of pondering and a fine target to shoot for. Even if you don't get all the way there, if you can get past Level 1.5, you'll be further along than most, and you'll be thinking more strategically about the game.

2 comments:

ΑΝΤΙ-ΟΠΑΠ, September 8, 2011 at 10:39 AM
Do i have to be proud to be the first one to tell you that you that this post is epic??? Keep it coming sir, you have at least one internation super fan
Mike Stein, September 8, 2011 at 1:23 PM
Thanks! It is very reassuring to know that this one makes sense to at least one other human out there ;)


Monday, March 14, 2011
