Today’s poker is not your grandfather’s game. Games have become unimaginably aggressive, and the level of play by both amateurs and pros has risen dramatically. Aggressive betting and bluffing have become the norm, and plays that were once considered state of the art are common knowledge. It's argued that the cause of this remarkable transition in both approach and skill level has been the influx of thousands of online pros competing in tough games over the internet. Increasingly, though, the online/live distinction doesn't hold up: today, pros are grouped more usefully by their poker methodology than by their preferred field of play. The old guard are exploitative players, always trying to gather enough information on their opponents to stay one step ahead. Meanwhile, the highest limits of online play have come to be dominated by game theory optimal (GTO, or optimal, for short) players who don’t much care what their opponent does and seek to play a strategy designed to beat any other strategy in the long run.
I started my serious poker career in the spring of 2008 firmly on the exploitative side of the fence. Since then, I’ve traversed to the other side and become a champion of GTO play both as player and teacher, all while competing in the highest-stakes games. In 2012 I was lucky enough to be the biggest winner in online poker cash games, taking home about $4 million. (I try not to delude myself: being a winner is primarily skill, being the biggest winner requires a lot of luck.) While I was a winning pro before 2012, I attribute much of my extraordinary success in the past year to my work on optimal poker.
For our purposes, an optimal player seeks to find the optimal strategy, which is the strategy such that any deviation from it breaks even or loses against our opponent’s best counterstrategy. For any game, there exists at least one optimal strategy. As a GTO poker pro, my time and effort go to getting as close to the optimal strategy as possible. Each step along this journey is called a near-optimal strategy.
As a poker player I have to prioritize the practical over the theoretical at all times. In other words, no matter how interesting this game theory stuff may be, I have to ask, “How is this going to make me money?” A frequent criticism of GTO play in the poker community is that it isn’t particularly profitable, more specifically that GTO play may be useful in minimizing losses against excellent players, but it fails to win significantly against weaker players. A version of this argument is made by David Sklansky in his classic Theory of Poker:
Game theory cannot replace sound judgment. It should only be used when you think your opponent's judgment is as good as or better than yours or when you simply don’t know your opponent. Furthermore, game theory can be used accurately to bluff or call a possible bluff only in a situation where the bettor obviously either has the best hand or is bluffing. (189)
The argument will become clearer through example—using a simple game we’re all familiar with: Rock, Paper, Scissors (RPS). A simple thought experiment should allow you to find the optimal strategy for RPS in just a few minutes. Imagine that before each throw, you had to write down on a slip of paper the frequency with which you would throw out rock, paper, or scissors—and then hand this slip of paper to your opponent. For example, rock half the time, paper half the time, and scissors never. Since your opponent now knows your strategy is to never play scissors, he’ll never play rock, and of his remaining choices paper is superior since it always breaks even or wins, while rock breaks even or loses. More generally, anytime our opponent knows that our frequencies are out of balance, we make it easy for him to pick a specific throw that will beat us in the long run.
Consequently, the optimal strategy for RPS must make all of our frequencies equal in order to defend against an opponent who knows our strategy. Playing rock, paper, and scissors one third of the time each is therefore the optimal strategy. A funny feature of RPS’s optimal strategy is that any strategy played against it will have an expected value (EV) of 0. Even the most exploitable strategies (for example, always rock) break even against the optimal strategy. If the optimal strategy in poker is like the optimal strategy in RPS and breaks even against all, or many, of our opponent’s counterstrategies, then, it can be argued, game theory should never replace our judgment. We play poker to win, and any sound poker strategy should aim to give us an expectation in excess of the rake.
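The slip-of-paper experiment can be checked with a few lines of arithmetic. The Python below is only a sketch under simple assumptions: a win scores +1, a loss -1, and a tie 0, and the names `expected_value` and `best_response` are illustrative rather than taken from any poker or game theory library.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(ours, theirs):
    """Assumed scoring: +1 if our throw wins, -1 if it loses, 0 on a tie."""
    if ours == theirs:
        return 0
    return 1 if BEATS[ours] == theirs else -1


def expected_value(our_mix, their_mix):
    """Our EV per throw when both sides throw with fixed frequencies."""
    return sum(
        our_mix[a] * their_mix[b] * payoff(a, b)
        for a in MOVES
        for b in MOVES
    )


def pure(move):
    """A strategy that always makes the same throw."""
    return {m: Fraction(int(m == move)) for m in MOVES}


def best_response(their_mix):
    """The single throw with the highest EV against a known frequency slip."""
    return max(MOVES, key=lambda m: expected_value(pure(m), their_mix))


# The slip from the thought experiment: rock half, paper half, scissors never.
no_scissors = {"rock": Fraction(1, 2), "paper": Fraction(1, 2), "scissors": Fraction(0)}
# The balanced strategy: one third each.
balanced = {m: Fraction(1, 3) for m in MOVES}
always_rock = pure("rock")

print(best_response(no_scissors))                           # paper
print(expected_value(pure("paper"), no_scissors))           # 1/2
print(expected_value(always_rock, balanced))                # 0
```

Against the unbalanced slip, paper earns half a unit per throw, while always rock, as exploitable as it is, still breaks even against the balanced one-third mix.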
Let’s think about one more simple game: tic-tac-toe (TTT). TTT is a game of complete information, which means we can see all of our opponent’s past moves. Any competent TTT player knows that games between two good players will always end in a draw. But let’s say we open with an X in the center and our opponent responds mistakenly with an O in the middle of an edge:
We’ve already won the game, and our opponent can only avoid a loss if we make a mistake. In the language of optimal strategies, our opponent has played an exploitable strategy, and if we respond with:
X O
X
X O
then we’re playing the optimal strategy. Had both TTT players played the optimal strategy, the game would end in a draw, which is similar to how we saw the optimal strategy always break even in the long run in RPS. But unlike RPS, when one player deviates from the optimal strategy in TTT, his opponent will be able to secure a win. The technical term for a suboptimal strategy that always breaks even or loses against the optimal strategy is a dominated strategy. My main project as an optimal poker pro is to eliminate from my game as many dominated strategies as possible.
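TTT is small enough to solve by brute force, so the claims above can be verified directly. The following is a minimal minimax sketch, not a reconstruction of the diagrams: it assumes the board is a 9-character string read row by row with `.` for an empty square, and it assumes the mistaken O lands on the middle of an edge. The names `winner` and `value` are illustrative.

```python
from functools import lru_cache

# The eight winning lines, as indices into a 9-character board string.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, to_move):
    """Result for X with perfect play from here: +1 win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if to_move == "X" else "X"
    results = [
        value(board[:i] + to_move + board[i + 1:], nxt)
        for i, cell in enumerate(board)
        if cell == "."
    ]
    # X picks the best outcome for X; O picks the worst outcome for X.
    return max(results) if to_move == "X" else min(results)


print(value(".........", "X"))  # 0: optimal play by both sides is a draw
print(value(".O..X....", "X"))  # 1: center opening, edge reply, X wins by force
print(value("O...X....", "X"))  # 0: a corner reply still holds the draw
```

The edge reply is the kind of mistake described above: once it is on the board, no later play by O can recover the draw against an opponent who keeps playing optimally.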
Chess is an example of a much more complicated game of complete information. Chess is complicated enough that it has still not been perfectly solved but simple enough that top computers almost always win against the best humans, and against average human players, the computer always wins. For the computer, chess is a lot like tic-tac-toe when played against a novice who doesn’t see that his strategy is dominated. In a game as complicated as chess, the optimal strategy will always secure a victory against even advanced suboptimal strategies.
In my experience poker is a lot like chess, and not very much at all like RPS. Optimal (or near-optimal) poker will absolutely crush even relatively strong strategies played by intelligent humans, because even professional players (myself included) employ many dominated strategies.
It turns out that some forms of poker are becoming much like chess. The simplest form of poker, from a programming perspective, is heads-up limit hold ’em, and today’s best bots routinely beat world-class human players by a significant margin. More specifically, world-class heads-up limit hold ’em pros typically win 1 to 5 big blinds per hundred hands (bb/100) against somewhat weaker players, and the best bots now beat the pros by about the same margin. It’s even possible to calculate how much a bot would lose to an opponent’s best counterstrategy (although it’s impossible to calculate the optimal strategy itself), and the best bots lose to that strategy by about twice as much as they beat the pros!*
But even if we play poker with the intention of exploiting our opponents, it’s often extremely helpful to know near-optimal play in order to pick an appropriate exploitative strategy, especially if our opponent is also a strong player. For example, say we’re playing someone who calls 40 percent of the time after opening on the button and being 3-bet by the big blind. If we have an idea of what the optimal 3-bet calling percentage is, then this information can be used to encourage us to 3-bet more aggressively for value (if she’s calling too much) or bluff more aggressively (if she’s folding too much). In other words, we can use our best guess of what optimal play is to make adjustments to exploit our opponents.
In addition, a near-optimal strategy in poker wins against just about any strategy an opponent is likely to play. This means we can ignore our opponent’s strategy most of the time and still expect to have a healthy winrate. If you’re an online player who multi-tables, ignoring your opponent’s strategy frees up a massive amount of attention.
At least for much of the poker-playing audience, GTO poker burst onto the scene via Chen and Ankenmann’s The Mathematics of Poker. But this text differs significantly in its approach from Applications, not least because upon opening Mathematics, you’ll be confronted by equations on nearly every page. More specifically, Mathematics uses toy games (usually simpler poker games that can be solved) to illustrate theoretical concepts. The advantage of this is that it allows precise solutions for simple games to be found; but it also leaves the interpretive process to the reader since the games do not reflect full-scale poker in many respects.
For anyone but the extremely dedicated and mathematically inclined, formulating a strategy for full-scale poker on the basis of toy games is going to be next to impossible. Until now, the few specialists familiar with these concepts have not made them publicly available. And to my knowledge, Matthew Janda is one of the first to apply these difficult but important concepts consistently to a full-scale poker game.
The analysis of preflop play in Applications highlights the breadth of Matthew’s approach. Were we to search for a hard solution to six-handed preflop play, the problem would be intractable: there are just too many possibilities. So instead of going for certainty, this text makes a few assumptions that seem reasonable and then consistently applies them around the six-handed poker table. The result is a set of preflop guidelines strikingly similar to my own strategy, which I’ve arrived at through playing hundreds of thousands of hands (over many years) at the highest stakes. For most positions, the differences between my opening ranges and those argued for in Applications are not significant. Where the differences are significant, I’ve found I’m either playing exploitatively, or in some cases I’m just making bad plays.
Matthew certainly has not solved the game; in fact, he won’t even try to, because he’s wise enough to know that the attempt is currently impossible. Instead, Matt has taken various principles that must necessarily apply to the full game and added to those principles some very smart assumptions that allow him to erect a unified GTO framework for six-max no-limit hold ’em from preflop to river. Armed with this framework and some software to aid him with the combinatorics, Matthew has been able to construct a robust set of strategic guidelines for six-max no-limit hold ’em which, if followed thoughtfully, should yield a substantial winrate even against stiff opposition. My game has improved substantially due to my reading of Applications, and I’m confident a close reading of it will help keep you ahead of the curve in today’s increasingly tough poker games.
—Ben “Sauce123” Sulsky