Quantitative Poker: Utility, Part 1: The Basics

Saturday, January 22, 2011

Before we can dive into any models of various poker decisions, we first need to establish the building blocks of models for rational preferences under uncertainty.

In economics, the foundation of any approach to any decision made under uncertainty is expected utility theory, which quantifies risk aversion through establishing a correspondence with diminishing marginal utility of wealth.

Those familiar with the basics of utility can skip to the end of this post, but should stay tuned for the upcoming parts, where I will build upon these fundamentals in ways which are less common in abstract theoretical models, but which are specifically well-suited for practical poker applications.

Motivation

The common heuristic approach to decision-making in poker is to make decisions in order to maximize one's expected value, with volatility ("variance", as it is usually not-quite-accurately termed) an unquantified afterthought, managed through heuristic rules, if at all. People understand that less risk is preferable to more risk as long as expected value remains the same, but there is usually little consideration given to quantifying the value of risk relative to expected value.

How much expected value should a decision-maker be willing to give up in order to reduce the variance of a random payoff by a certain amount? More generally, how do rational decision-makers value the tradeoff between expectation and risk?

General Utility Functions

Preferences over different levels of wealth are quantified by assigning a utility function to each person or entity, a function which maps a level of wealth to a level of overall personal satisfaction derived from that wealth. The usual assumptions on a utility function are that it is:

Increasing — Everyone prefers more money to less money.
Continuous — There's no specific amount of wealth that is suddenly much more preferable to a slightly smaller amount of wealth.
Concave — The slope of the function is decreasing. As one has more wealth, an additional dollar is less valuable, e.g. a poor person is much happier finding $100 than a millionaire is. This is equivalent to the individual being risk-averse, rather than risk-neutral or risk-seeking.

Any function which satisfies these conditions is a potentially reasonable utility function. The precise form of the function will depend on the individual's specific preferences for different levels of wealth and, as we will see, his specific risk preferences.

Isoelastic Utility

One basic example is the isoelastic utility function, given by

$$u_{IE,\rho}(x) = \frac{x^{1-\rho}}{1-\rho}, \quad \text{for } \rho \geq 0,\ \rho \neq 1$$

Notice that, for ρ=0, this is simply the identity function, which represents no diminishing marginal utility of wealth and no aversion to risk. As ρ increases, the marginal utility of wealth becomes more diminishing, so ρ can be seen as a parameterization of risk aversion. A higher value of ρ means a higher aversion to risk.
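As a rough illustration of how ρ controls diminishing marginal utility, the sketch below compares how much utility an extra $100 adds at two wealth levels. The function names and the wealth levels ($1,000 and $1,000,000) are assumptions chosen for illustration, and ρ=1 is rejected because the formula divides by 1−ρ.

```python
def isoelastic_utility(x: float, rho: float) -> float:
    """Isoelastic utility x**(1 - rho) / (1 - rho), for rho >= 0 and rho != 1."""
    if rho < 0 or rho == 1:
        raise ValueError("rho must be non-negative and different from 1")
    return x ** (1.0 - rho) / (1.0 - rho)


def utility_gain(wealth: float, windfall: float, rho: float) -> float:
    """Extra utility from adding `windfall` dollars to `wealth` dollars."""
    return isoelastic_utility(wealth + windfall, rho) - isoelastic_utility(wealth, rho)


# Illustrative wealth levels (assumptions, not from the post).
for rho in (0.0, 0.5, 0.9):
    poor = utility_gain(1_000, 100, rho)
    rich = utility_gain(1_000_000, 100, rho)
    # Ratio of 1 means the $100 is worth the same to both; smaller means
    # the wealthier player values it less.
    print(f"rho={rho}: rich/poor utility gain from $100 = {rich / poor:.4f}")
```

With ρ=0 the ratio is 1 (no diminishing marginal utility), and it falls toward 0 as ρ grows.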

For ρ=0.5, this function looks like this:

This function satisfies all of the desired properties. Though the scale of this plot does not indicate it well, the function lies below the identity function at every wealth level above $4, so this utility function can be thought of as a means of "discounting" wealth in a way that accounts for diminishing marginal utility of wealth. Note, however, that it is not necessary that the scale of the function match that of the wealth; we shall see that the particular values taken by the utility function are irrelevant for decision-making, as they get mapped back into dollars after accounting for the different random payoffs of an opportunity.

The isoelastic utility function is said to represent constant relative risk aversion (CRRA), as the individual's aversion to a risk depends only on the size of that risk as a proportion of his wealth. With higher wealth, he is less averse to a risk of any fixed dollar amount. This is a desirable property and is generally fairly consistent with real-life decisions and the rules of thumb that most poker players use in managing bankroll requirements as they move up in stakes.
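One way to make the "constant relative" label precise is the standard Arrow-Pratt measure of risk aversion, sketched here for the isoelastic function. The symbols $A(x)$ and $R(x)$ are notation introduced for this aside, not something the post itself uses.

```latex
\[
u_{IE,\rho}'(x) = x^{-\rho}, \qquad u_{IE,\rho}''(x) = -\rho\,x^{-\rho-1}
\]
\[
A(x) = -\frac{u_{IE,\rho}''(x)}{u_{IE,\rho}'(x)} = \frac{\rho}{x},
\qquad
R(x) = x\,A(x) = \rho
\]
```

Absolute risk aversion $A(x)$ shrinks as wealth grows, while relative risk aversion $R(x)$ stays fixed at ρ for every wealth level.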

Exponential Utility

Another simple example is the exponential utility function, given by

$$u_{\exp,c}(x) = 1 - \exp(-cx), \quad \text{for } c \geq 0$$

For c=1/150000, this function looks like this:

The exponential utility function is said to represent constant absolute risk aversion (CARA), as the individual's aversion to risk is always constant regardless of his wealth. In practice, few people would exhibit constant absolute risk aversion, as we should expect that most rational individuals' risk aversion should decrease as wealth increases, though perhaps not according to the proportional scale of the CRRA utility function.
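A short calculation with the standard Arrow-Pratt measure of absolute risk aversion, $A(x) = -u''(x)/u'(x)$ (notation assumed for this aside), shows where the CARA label comes from:

```latex
\[
u_{\exp,c}'(x) = c\,e^{-cx}, \qquad u_{\exp,c}''(x) = -c^{2}\,e^{-cx}
\]
\[
A(x) = -\frac{u_{\exp,c}''(x)}{u_{\exp,c}'(x)} = c
\]
```

The wealth level x drops out entirely, so under this function a risk of a given dollar size is treated the same way at any net worth.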

The exponential utility function is bounded from above, but that does not mean that an individual with this utility function has any upper bound to the amount of wealth he prefers. We will see, however, that this does make the individual less likely to take risks for large amounts of money.

Utility and Risk Aversion

Let's say an individual who has a net worth of $500,000 and isoelastic utility with ρ=0.5 (defined on his net worth) is given the opportunity to bet all $500,000 on the flip of a fair coin, receiving a payoff of $1,000,000 if it comes up heads and being broke if it comes up tails. What is the value to him of taking the bet? The expected value of the amount of wealth he will have after taking the bet is clearly $500,000, but the expected utility of this random payoff is given by

$$\begin{aligned}
\mathbb{E}\,u_{IE,\rho}(X) &= \frac{1}{2}\,u_{IE,\rho}(\$0) + \frac{1}{2}\,u_{IE,\rho}(\$1000000) \\
&= 0 + \frac{1}{2}\,u_{IE,\rho}(\$1000000) \\
&= 1000
\end{aligned}$$

Since the utility function is continuous and increasing, there is a unique dollar value, known as the certainty equivalent, that yields the same expected utility as any random payoff. It is the unique solution of the equation:

$$u_{IE,\rho}(x) = 1000 \quad\Rightarrow\quad x = \$250000$$

Here, the certainty equivalent is $250,000. So while a completely risk-neutral individual should be indifferent between betting his $500,000 net worth on this flip or not, the risk-averse individual with these particular preferences would rather have $250,000 for certain than bet his $500,000 on the flip. Since having $500,000 for certain is even better than having $250,000 for certain, the risk-averse individual of course passes on this opportunity. He would only be willing to spend his net worth to have a 50/50 chance at having either $1,000,000 or $0 if his net worth were less than $250,000.
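The arithmetic of this example can be reproduced in a few lines. This is only a sketch: the helper names are made up for illustration, and the inverse function assumes 0 ≤ ρ < 1 so that utility is zero at zero wealth.

```python
def isoelastic_utility(x: float, rho: float) -> float:
    """Isoelastic utility x**(1 - rho) / (1 - rho)."""
    return x ** (1.0 - rho) / (1.0 - rho)


def isoelastic_inverse(u: float, rho: float) -> float:
    """Wealth level whose isoelastic utility equals u (assumes 0 <= rho < 1)."""
    return (u * (1.0 - rho)) ** (1.0 / (1.0 - rho))


rho = 0.5
# The coin flip from the text: (probability, resulting net worth).
outcomes = [(0.5, 0.0), (0.5, 1_000_000.0)]

expected_wealth = sum(p * x for p, x in outcomes)
expected_utility = sum(p * isoelastic_utility(x, rho) for p, x in outcomes)
certainty_equivalent = isoelastic_inverse(expected_utility, rho)

print(f"expected wealth:      {expected_wealth:,.2f}")
print(f"expected utility:     {expected_utility:,.2f}")
print(f"certainty equivalent: {certainty_equivalent:,.2f}")
```

The three printed values correspond to the $500,000 expected value, the expected utility of 1000, and the $250,000 certainty equivalent discussed above.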

To get a feel for the practical implications of each of these two basic forms of utility functions, we can look at the certainty equivalents for similar situations of betting one's net worth on a coin flip, for varying values of net worth. The 1st column is the payoff for winning the coin flip (twice the net worth), the 2nd column is the certainty-equivalent value for the individual with isoelastic utility (with parameter ρ=0.5), and the 3rd column is the certainty-equivalent value for the individual with exponential utility (with parameter c=1/150000):

So, for example, an individual with exponential utility (with parameter c=1/150000) would only be willing to spend $4,916.68 on a 50/50 chance of winning $10,000.
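A table of this kind can be regenerated with a short script. The sketch below uses the parameters stated above (ρ=0.5 and c=1/150000); the list of payoffs is an assumption chosen for illustration, since only the $10,000 and $1M flips are named in the text, and the function names are hypothetical.

```python
import math

RHO = 0.5          # isoelastic parameter from the text
C = 1 / 150_000    # exponential parameter from the text


def ce_isoelastic(payoff: float, rho: float = RHO) -> float:
    """Certainty equivalent of a 50/50 flip between $0 and `payoff` (0 <= rho < 1)."""
    expected_utility = 0.5 * payoff ** (1.0 - rho) / (1.0 - rho)  # u(0) = 0
    return (expected_utility * (1.0 - rho)) ** (1.0 / (1.0 - rho))


def ce_exponential(payoff: float, c: float = C) -> float:
    """Certainty equivalent of the same flip under u(x) = 1 - exp(-c x)."""
    expected_utility = 0.5 * (1.0 - math.exp(-c * payoff))  # u(0) = 0
    return -math.log(1.0 - expected_utility) / c


print(f"{'payoff':>12} {'isoelastic CE':>16} {'exponential CE':>16}")
for payoff in (10_000, 100_000, 1_000_000):  # illustrative payoffs
    print(f"{payoff:>12,} {ce_isoelastic(payoff):>16,.2f} {ce_exponential(payoff):>16,.2f}")
```

The $10,000 row reproduces the $4,916.68 figure quoted above for exponential utility.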

A few simple observations:

For isoelastic utility, the certainty equivalent is always a fixed percentage of the expected value of the coinflip. This is true regardless of what we set the parameter ρ equal to. So an individual with isoelastic utility is willing to bet his entire net worth on any weighted coinflip with fixed probability of winning (or on any 50/50 coinflip with a fixed percentage overlay, as in the example here), regardless of his wealth. This is unlikely to reflect any real person's preferences for such opportunities, but might be OK when we consider situations where the bet is for less than one's net worth.
The certainty equivalents under exponential utility decrease significantly when more money is at risk. While the certainty equivalents in the table above for the smaller flips seem to be roughly in line with what most well-bankrolled poker players (with CARA utility, one's net worth relative to the bet size does not matter) would be willing to pay for these coinflips, most would likely be willing to pay more for the $1M flip. This disparity can't be rectified by playing with the parameter c; if we reduce c enough that the player would be willing to pay something somewhat closer to $500,000 for the $1M flip, then the certainty equivalents for the smaller flips become extremely close to the pure expected values.
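Both observations can be checked in closed form. As a sketch, write $W$ for the payoff of a 50/50 flip between $0 and $W$ and $\mathrm{CE}$ for its certainty equivalent (both symbols are notation assumed here); setting the utility of $\mathrm{CE}$ equal to half the utility of $W$ and solving gives:

```latex
\[
\mathrm{CE}_{IE} = 2^{-1/(1-\rho)}\,W \qquad (0 \le \rho < 1)
\]
\[
\mathrm{CE}_{\exp} = -\frac{1}{c}\,\ln\!\left(\frac{1+e^{-cW}}{2}\right)
\approx \frac{W}{2} - \frac{cW^{2}}{8} \quad (\text{small } cW),
\qquad
\mathrm{CE}_{\exp} < \frac{\ln 2}{c} \quad (\text{all } W)
\]
```

With ρ=0.5 the isoelastic factor is 1/4 of the payoff, i.e. half of the expected value, at every stake. With c=1/150000 the bound ln 2 / c is roughly $103,972, so no flip of any size is worth more than that under exponential utility. Shrinking c raises that cap, but it also shrinks the discount cW²/8 on small flips, which is the tension described in the second observation.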

So these two simple utility functions may each be imperfect for capturing real-life risk preferences, at least for individuals risking their entire net worth in the case of isoelastic utility.

These two utility functions are the most commonly-used in mathematical models due to their desirable analytical properties, but for the purposes of making practical poker decisions, where the discrete-time nature of poker opportunities makes it unlikely that the methods of calculus would lead to nice analytical solutions in many models anyway, we should be fine with choosing any admissible utility function that can be evaluated numerically. If we look at more practical situations where only a portion of one's net worth is at risk, we might be able to find a good fit with the isoelastic utility function, or we might be better-served by building some sort of ugly-but-practical "hybrid" utility function.
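As a sketch of what "evaluated numerically" could look like, the function below finds the certainty equivalent of any finite gamble by bisection, relying only on the utility function being continuous and increasing. The name `certainty_equivalent` and the `(probability, wealth)` outcome format are assumptions made for this example, not something defined in the post.

```python
import math
from typing import Callable, Sequence, Tuple


def certainty_equivalent(
    utility: Callable[[float], float],
    outcomes: Sequence[Tuple[float, float]],
    tol: float = 1e-6,
) -> float:
    """Dollar amount whose utility equals the gamble's expected utility.

    `outcomes` is a sequence of (probability, wealth) pairs. `utility` must be
    continuous, increasing, and finite at every listed wealth level.
    """
    target = sum(p * utility(x) for p, x in outcomes)
    # An increasing utility puts the answer between the worst and best outcome.
    lo = min(x for _, x in outcomes)
    hi = max(x for _, x in outcomes)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if utility(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def isoelastic(x: float) -> float:
    return x ** 0.5 / 0.5                     # rho = 0.5


def exponential(x: float) -> float:
    return 1.0 - math.exp(-x / 150_000)       # c = 1/150000


flip_1m = [(0.5, 0.0), (0.5, 1_000_000.0)]
flip_10k = [(0.5, 0.0), (0.5, 10_000.0)]
print(f"{certainty_equivalent(isoelastic, flip_1m):,.2f}")
print(f"{certainty_equivalent(exponential, flip_10k):,.2f}")
```

Because nothing here depends on a closed-form inverse, the same routine would work for a "hybrid" utility function, or for gambles with more than two outcomes.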

Eventually, we'll use the methods of utility functions to look at the following questions:

When players have practical and tax-conscious utility preferences, how much effective rake are we really paying for our chance at the glory of the WSOP Main Event title?
What sort of approximate hand-by-hand utility functions should the Loose Cannon on the PokerStars Big Game have?
How can a backer and a player formulate a split of a payoff in a way which is optimal for each of their personal risk preferences?
Full Tilt takes $1 out of the pot if you want to run it twice; under what conditions would we prefer to pay this fee to reduce volatility?
Does whether or not we would take a certain risk ever depend on how many opportunities we will be given to play that game? In particular, is it a fallacy to manage the risk in a unique opportunity differently because we are unable to "reach the long run" with it?

But first, coming up next...

Part 2: Finding or constructing a utility function that accurately represents practical risk preferences for poker players
Part 3: Effects of taxation — and they're BIG ones

9 comments:

Odatafan, June 11, 2011 at 4:36 AM
The idea of bounded utility (as in your exponential example) and of decreasing utility are well-studied in economics. But gamblers seem to experience increasing marginal utility. Without that, why would anyone play a slot machine or buy a lottery ticket? Is it essential to your theory of poker that utility diminish?

I wrote about this at http://wnio.blogspot.com/2011/06/utility-of-gambling.html

Mike Stein, June 11, 2011 at 5:19 AM
The idea of having higher wealth leading to higher-ROI opportunities is something I had never thought about on as general a level as you describe. That you do better in pure-EV terms by taking a -EV gamble before investing in something paying compound interest is counterintuitive but very clear, and that has some interesting implications. The overall perspective of such a rational gamble being a simple way of reallocating wealth between people with similar investment goals is quite elegant. I wonder how many of such cases that occur in practice would be crowded out when risk aversion is factored in.

One example along these lines that I did think about once is if a skilled poker player is in a casino and is looking at a poker game that he thinks he can crush, but the buyin is $500 and, despite having a sufficient bankroll, he only has $250 on him at the time, he'll probably have a higher expected utility than just going home by instead trying to quickly double up the $250 at roulette and playing in the poker game if he does (and going home otherwise).

As to the implications of this on marginal utility, I would say that the typical utility functions with decreasing marginal utility should be generally applicable to rational people in a vacuum (i.e. without specific investment goals, and without external investment opportunities where having more wealth would improve your ROI in a convex way). If one were a player with one of these specific investment goals, you could tweak your utility function to have a big jump at the target level of wealth, and all practical stuff you would do with it should still work nice.

I was planning on saying this: I think it's important to, at least in theory, demand that all rational players in a poker game have concave utility functions. In some sense, the rules of poker demand that decisions in the game are "supposed to" be made when they are +EV in the pure-EV sense. When players are risk averse, slightly -EV opportunities get passed up on, but that's not necessarily a big perturbation to the in-game decisions, especially because most players can be expected to be at least slightly risk-averse.

But it occurs to me that there's not really any fundamental difference between the perturbation of shutting out slightly-plus-EV moves and the perturbation of adding in slightly-minus-EV moves. So maybe it is fair to consider it.

The easy answer is that anyone who would have increasing marginal utility and would take a slightly-minus-EV move in poker is also someone who would gain utility by playing small-edge house games, so they could just play those instead of playing poker... but, if they're good at poker and are +EV in the game due to their skill, they would do better by playing the game and also taking the slightly-minus-EV opportunities. Poker would serve as a means of matching such rational risk-seeking people up with each other, just as the examples in your blog.

As to why anyone ever chooses to play a slot machine or lottery ticket, I think the other possible explanations are much more probable than legitimately rational decisions of individuals with increasing marginal utility. For one, such a legitimately rational person would do better by seeking out a similar person and gambling with them with no house edge. I think the vast majority of casino gambling is instead completely covered by factors such as willingness to pay for the entertainment of the process of gambling, heuristic probability biases, or just plain ignorance. My personal opinion is that there is more than enough evidence for all three of these explanations that economics need not necessarily search for other explanations, and for the first one, it can certainly be rational to pay for the entertainment value for those who happen to enjoy gambling.

I still feel like these are corner cases that don't need to be directly accounted for in poker theory, but you've given me a lot to think about. Your blog is very interesting and I shall be following it!

Odatafan, June 11, 2011 at 5:36 AM
So, it *does* change your winning strategy if you have marginal increasing utility, in that you'd take -EV gambles. Is that a losing strategy against players who are more conservative? If it happens that some players take -EV gambles and some don't, there may be words for this in Poker lingo.

Mike Stein, June 11, 2011 at 5:57 AM
It would change your winning strategy and make your overall strategy have lower EV, but may keep you +EV overall. It will indeed be directly sacrificing money to all other opponents, and the conservative or non-risk-seeking opponents will all benefit. It is certainly a losing perturbation to one's strategy.

More importantly for the point I was making, even if adding the losing gambles makes the otherwise skilled player -EV in poker, he should still have a higher ROI than he would in any casino game; he only takes the -EV gambles in the poker game which favors his utility function (i.e. ones where he is getting the minimum level of odds that he'd demand for any gamble, such as a casino game), and he still gets plenty of actual +EV opportunities along the way, so on average, he should do better than the "minimum odds threshold" induced by his utility function.

Most commonly, the word for a player who takes -EV gambles in poker lingo is just "fish". In my personal experience and conjecture, there should be very few good players who would ever knowingly make a move that was -EV within the game. Maybe the risk-seeking players you describe tend to not become good poker players as often, or maybe good poker players are so conditioned to focusing entirely on EV that they neglect to take -EV gambles that may favor their own interests out of discipline.

Odatafan, June 13, 2011 at 12:43 AM
You point out the difference between a -EV move in the game and on the other hand viewing the whole poker game from buy-in to cashing out as a gamble. I believe that we can take a -EV move on the whole game as part of a winning investment strategy. I wonder whether -EV moves within the game are a bad choice because other players will punish that player. I guess you think that they definitely will take advantage of that player. That makes sense.

I'd like to view the chip stack as wealth and moves in the game as investments made. A player with more chips has an advantage, I hear, and can make better investments. If so, then the -EV argument applies: if you make a -EV move and either lose or win and invest the money well conservatively, the average of these two strategies is better than simply staying behind in chips and making comparably worse bets than your chip-leading opponent.

If I try to manufacture a situation in which the opponents cannot punish me.. maybe if I am the last player to decide whether to call or fold, on the last round of a hand. Even in this case, I will show my cards, which will reveal my -EV move. So, I could very much believe that although a -EV move can be advantageous if winnings can be invested conservatively, in competitive games it will be punished.

How much of an advantage it is to have extra chips? Suppose you make two investments: you enter two players in two poker games. After some time, one of your players is winning and the other player is losing. The casino offers you the option to transfer some chips from one player to the other. Should you transfer chips to the player who is doing badly, or to the player who is doing well? This reflects the relative marginal ROI of the two situations.

Odatafan, June 13, 2011 at 12:45 AM
Replace: the average of these two outcomes is better / the average of these two strategies is better

Mike Stein, June 15, 2011 at 6:36 PM
Actually, that a player with more chips has an advantage in *cash game* poker is a popular misconception. In fact, it can only be a disadvantage, though the effect is often negligible. If you have $1,000 and the rest of the table has $100 each, you have no in-game advantage, as you'll never get all-in for more than $100. If, however, you have $100 and the rest of the table has $1,000 each, you are playing the exact same game as if the other players all had $100 each, *except* that if you get all-in in a multiway pot, the other players might bet each other out after you are all-in, which can only increase your chances of winning the main pot.

In tournaments, I have always assumed a model of concave "chip stack utility". Conventional wisdom is that in almost all tournament situations, you need better than a 50% chance to commit all of your chips. This doesn't directly extend logically to a concave chip stack utility function, but I suspect that this is still the case except for possibly some special cases... I can't think of any right now.

There might be some advantages outside of the game where you might be willing to risk your stack with only a 49% chance to win, such as the example I discussed earlier where you want to move up to a soft higher-stakes game but don't have enough cash on hand. But, for a cash game, there's definitely no in-game advantage for a big stack that would cause in-game utility to be anything but concave (usually very close to linear, and only slightly concave for the effect of multiway all-ins that I discussed).

Odatafan, June 20, 2011 at 3:30 AM
OK... if there is no move-by-move advantage to having a large stack, then I agree that no one should take a -EV move. And yet, you are right that I had expected the winning player to get a move-by-move advantage. I had imagined that the player with the bigger stack might sit comfortably behind that large stack of chips and play only the best hands, while the player with fewer chips might feel, move by move, the pressure of having to pay the ante, forcing the low-stack player to accept less-desirable bets. But as you say, if one player has 1000 chips, and the next-richest player has only 100, then this hand is played as though all players had only 100 chips, with the difference that the player with 1000 cannot go "all-in" and the player with 1000 chips cannot lose in this hand.

Analogous to an "interest rate" accruing to a stack of chips is the ante draining smaller stacks faster than they drain big stacks. The amount to which the ante bothers you is -A/(PS) (total ante -- big blind plus small blind /(number of players times your chip stack)). If you always fold, you'll see PS/A hands before you run out of money, and surely we'd all prefer that number to be higher. That interest rate is concave; the "compound interest rate" accruing after n moves is -nA/PS which is also negative and concave in S. When interest is concave in wealth, -EV moves take you from your current value to a linear combination of the values you'd have if you won or lost and are always bad. So... never make a -EV move in Poker when S > 5A (where I picked 5 because I can't imagine worrying about being forced to gamble on my last hand, when 5P hands remain before that hand comes).

Consider your penultimate move -- on the next move you'll be forced to bet everything in order to pay a blind. At this moment, the interest curve might be locally convex: if you win, you get some breathing space and a lower per-hand fee, at least for the next few hands; if you lose, you still face a 100% fee on the next turn. So the "interest-rate versus wealth" curve is -100%, -100%, -20% (A/PS). So if to call right now has an EV below 0 but above the EV of a random hand (what you expect next turn), then you should call now with tomorrow's blinds-money. And, anticipating this, on the *previous* hand, if you see a -EV option which has EV better than you expect to get on your penultimate hand, then you should take it. And so on, arguing back; since we're cutting "between" the next-hand's expected value and 0 every turn, the -EV bets we expect to take are half as big each turn we go, so very soon we should ignore this and just say "Taking a 0EV move on your last 4 or 5 hands, to avoid taking a -EV move in your last 2 or 3 hands." These -EV moves work because if you do nothing you will be charged a high fee in ante; your only choices are to bet the money into a -EV move close to 0EV or wait until the blinds come, or choose to bet the money on a random hand when the blinds come.

Mike Stein, June 25, 2011 at 4:21 PM
I'm not totally sure that it's the case here, but the idea that the proportion of one's stack that a blind or ante represents is something that affects play is usually a misconception as well. The common forms it appears in are stuff like "I've only got 10 times the big blind left, I should just go with any decent hand I get" or "In a shorthanded game, hand selection changes because the blinds come around quicker". Neither of these comes to the wrong conclusion, but they are likely the wrong approach.

For a cash game, at least, the usual goal being to maximize one's expected value (utility) on every hand, one would not concern oneself with geometric growth rates of the percentage of stack that an ante represents. I think your example makes sense from an expected utility perspective if the player's entire net worth is at risk in the poker game, but if the player is playing a cash game for only a small portion of his net worth, he could just add on extra money if he wanted his small stack to be bigger.

For a tournament, your approach would make sense if your expected utility of prize payout (i.e. your position in the tournament) was concerned with the geometric growth of your stack, but I'm not sure how close that is to reality.

The hypothetical perfect map from tournament chip stack to expected utility of prize payout is certainly one that takes future hands into consideration. It would consider that, by folding your tiny stack first to act, you'll be putting it all in blind on the next hand. There could definitely be inflection points in this scenario for small stack sizes where your *chip stack* utility became convex at certain points, but I'm not convinced that they would exist at all. They probably don't matter very much practically, in that they might only exist for a narrow window of tiny stack sizes, but I'm not sure.