tag:blogger.com,1999:blog-48376270297426181062017-09-08T16:57:24.602-07:00Computational BackgammonMy attempt to build a neural network-based backgammon playerMark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.comBlogger101125tag:blogger.com,1999:blog-4837627029742618106.post-91469034003134816182012-08-20T15:08:00.001-07:002012-08-21T15:23:15.639-07:00Improved GNUbg benchmarks<div dir="ltr" style="text-align: left;" trbidi="on">The GNUbg team (in particular, Philippe Michel) has <a href="http://lists.gnu.org/archive/html/bug-gnubg/2012-08/msg00010.html">created</a> new benchmark databases for Contact, Crashed, and Race layouts, using the same set of board positions but rolling out the equities with more accuracy. This corrects the significant <a href="http://compgammon.blogspot.co.uk/2012/06/gnubg-crashed-benchmark-issue.html">errors</a> found in the old Crashed benchmark, and improves the Contact and Race benchmarks.<br /><br />They are available for download <a href="http://www.gnubg.org/media/nn-training/pmichel">here</a>, in the benchmarks subdirectory.<br /><br />Philippe also did some work on improving the board positions included in the Crashed training database, which is available for download in the training_data subdirectory at that link.<br /><br />I re-ran the statistics for several of my players, as well as for PubEval. Also Player 3.6 as the most comprehensive benchmark.<br /><br /><div class="nobrtable" style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><div style="margin: 0px;">Player</div></th><th><div style="margin: 0px;">GNUbg Contact ER</div></th><th><div style="margin: 0px;">GNUbg Crashed ER</div></th><th><div style="margin: 0px;">GNUgb Race ER</div></th><th><div style="margin: 0px;">PubEval Avg Ppg</div></th><th><div style="margin: 0px;">Player 3.6 Avg Ppg</div></th></tr><tr><td><div style="margin: 0px;">GNUbg</div></td><td><div style="margin: 0px;">10.5</div></td><td><div style="margin: 0px;">5.89</div></td><td><div style="margin: 0px;">0.643</div></td><td><div style="margin: 0px;">0.63</div></td><td><div style="margin: 0px;">N/A</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/08/player-36-longer-training-results.html">Player 3.6</a></div></td><td><div style="margin: 0px;">12.7</div></td><td><div style="margin: 0px;">9.17</div></td><td><div style="margin: 0px;">0.817</div></td><td><div style="margin: 0px;">0.601</div></td><td><div style="margin: 0px;">0.0</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/05/player-35-new-input-escapes-from-bar.html">Player 3.5</a></div></td><td><div style="margin: 0px;">13.1</div></td><td><div style="margin: 0px;">9.46</div></td><td><div style="margin: 0px;">0.817</div></td><td><div style="margin: 0px;">0.597</div></td><td><div style="margin: 0px;">-0.0027</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/04/player-34-incrementally-better.html">Player 3.4</a></div></td><td><div style="margin: 0px;">13.4</div></td><td><div style="margin: 0px;">9.63</div></td><td><div style="margin: 0px;">0.817</div></td><td><div style="margin: 0px;">0.596</div></td><td><div style="margin: 0px;">-0.0119</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/03/player-33-td-then-sl.html">Player 3.3</a></div></td><td><div style="margin: 0px;">13.4</div></td><td><div style="margin: 0px;">9.89</div></td><td><div style="margin: 0px;">0.985</div></td><td><div style="margin: 0px;">0.595</div></td><td><div style="margin: 0px;">-0.0127</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/02/player-32-adding-crashed-network.html">Player 3.2</a></div></td><td><div style="margin: 0px;">14.1</div></td><td><div style="margin: 0px;">10.7</div></td><td><div style="margin: 0px;">2.14</div></td><td><div style="margin: 0px;">0.577</div></td><td><div style="margin: 0px;">-0.041</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/02/player-32q-new-filter-strategy.html">Player 3.2q</a></div></td><td><div style="margin: 0px;">33.7</div></td><td><div style="margin: 0px;">26.2</div></td><td><div style="margin: 0px;">2.45</div></td><td><div style="margin: 0px;">0.140</div></td><td><div style="margin: 0px;">-0.466</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/01/player-24-berliner-primes-inputs.html">Player 2.4</a></div></td><td><div style="margin: 0px;">18.2</div></td><td><div style="margin: 0px;">21.7</div></td><td><div style="margin: 0px;">2.05</div></td><td><div style="margin: 0px;">0.484</div></td><td><div style="margin: 0px;">-0.105</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/01/benchmark-2-different-number-of-hidden.html">Benchmark 2</a></div></td><td><div style="margin: 0px;">21.6</div></td><td><div style="margin: 0px;">23.2</div></td><td><div style="margin: 0px;">5.54</div></td><td><div style="margin: 0px;">0.438</div></td><td><div style="margin: 0px;">-0.173</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://compgammon.blogspot.co.uk/2012/04/pubeval-revisited-using-listnet.html">PubEval (ListNet)</a></div></td><td><div style="margin: 0px;">41.7</div></td><td><div style="margin: 0px;">50.5</div></td><td><div style="margin: 0px;">2.12</div></td><td><div style="margin: 0px;">0.048</div></td><td><div style="margin: 0px;">-0.532</div></td></tr><tr><td><div style="margin: 0px;"><a href="http://www.bkgm.com/rgb/rgb.cgi?view+610">PubEval</a></div></td><td><div style="margin: 0px;">44.2</div></td><td><div style="margin: 0px;">51.3</div></td><td><div style="margin: 0px;">3.61</div></td><td><div style="margin: 0px;">0</div></td><td><div style="margin: 0px;">-0.589</div></td></tr></tbody></table></div><div class="nobrtable" style="background-color: white; color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"><br /></div>For the games against PubEval I ran 40k cubeless money games; standard errors are +/- 0.006ppg. Down to Player 3.2, for the games against Player 3.6 I ran 400k cubeless money games to get more accuracy; standard errors are +/- 0.002ppg or better. For players worse than Player 3.2 I played 100k games against Player 3.6 as the average scores were larger; standard errors are +/- 0.004ppg.<br /><br />Phillippe Michel was gracious enough to provide the GNUbg 0-ply scores against the newly-created benchmarks. Also it seems like I had the scores against the old benchmarks incorrect: they were Contact 10.4, Crashed 7.72, and Race 0.589. The Contact score was close, but the other two I had significantly worse.<br /><br /><br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-19890696110197211342012-08-19T00:57:00.001-07:002012-08-19T00:57:56.112-07:00Player 3.6: longer training results<div dir="ltr" style="text-align: left;" trbidi="on">I haven't had much time lately to work on this project, but while I'm engaged elsewhere I thought I'd run training for a long period and see whether it continued to improve.<br /><br />It did - fairly significantly. So my players before clearly were not fully converged.<br /><br />The result is my new best player, Player 3.6. Its <a href="http://compgammon.blogspot.co.uk/2012/02/better-benchmark-gnubg-benchmark.html">GNUbg benchmark</a> scores are Contact 12.7, Crashed 11.1, and Race 0.766. In 400k cubeless money games against <a href="http://compgammon.blogspot.co.uk/2012/05/player-35-new-input-escapes-from-bar.html">Player 3.5</a> it averages 0.0027ppg +/- 0.0018 ppg, a small improvement.<br /><br />In 40k games against <a href="http://compgammon.blogspot.co.uk/2012/01/benchmark-2-different-number-of-hidden.html">Benchmark 2</a> it averages 0.181 +/- 0.005 ppg, and against PubEval 0.601 +/- 0.006 ppg.<br /><br />For training I used supervised learning with three parallel and independent streams: one with alpha=0.01, one with alpha=0.03, and finally one with alpha=0.1. This was meant to test the optimal alpha to use.<br /><br />Surprisingly, alpha=0.01 was not the best level to use. alpha=0.03 improved almost 3x as quickly. alpha=0.1 did not improve well on the Contact benchmark score but did improve the best for the Crashed benchmark score.<br /><br />I take from this that alpha=0.03 is the best level of alpha to use for long term convergence.<br /><br />The Crashed benchmark score we know is not that useful: the Crashed benchmark itself is <a href="http://compgammon.blogspot.co.uk/2012/06/gnubg-crashed-benchmark-issue.html">flawed</a>, and a multi-linear regression <a href="http://compgammon.blogspot.co.uk/2012/02/bug-in-benchmark-calculation-fixed.html">showed</a> very little impact on score of the Crashed benchmark. That said, I tried a little experiment where I used the Contact network for crashed positions in Player 3.5 and it definitely worsened performance in self-play: 0.04ppg on average. That is a significant margin at this point in the player development.<br /><br />I ran 4,000 supervised learning steps in the training, for each of the three alpha levels. In each step I training on a randomly-arranged set of Contact and Crashed training positions from the training benchmark databases. This took a month and a half. The benchmark scores were still slowly improving for alpha=0.01 and alpha=0.03, so there is still scope for improvement. I stopped just because the GNUbg folks have put out new benchmark and training databases that I want to switch to.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-6305054521286272372012-06-12T10:27:00.001-07:002012-06-12T10:27:59.091-07:00GNUbg Crashed benchmark issue<div dir="ltr" style="text-align: left;" trbidi="on">It looks like the Crashed benchmark set in the GNUbg benchmarks isn't very accurate in some cases.<br /><br />There is a <a href="http://lists.gnu.org/archive/html/bug-gnubg/2012-06/msg00006.html">thread</a> discussing it in the GNUbg mailing list.<br /><br />Interesting to know, and hopefully the GNUbg team will improve on it; but the Crashed benchmark score is <a href="http://compgammon.blogspot.co.uk/2012/02/bug-in-benchmark-calculation-fixed.html">not very predictive</a> for overall gameplay, as I've discovered while comparing players of different strengths.<br /><br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-42726329354181608952012-05-14T14:14:00.000-07:002012-05-14T14:14:18.452-07:00Player 3.5: new input, escapes from the bar<div dir="ltr" style="text-align: left;" trbidi="on">I tried another <a href="http://compgammon.blogspot.com/2012/04/new-inputs-failure-bar-hitentry.html">new input</a> for contact and crashed networks: this time, the expected escapes if you have a single checker on the bar. That is, looking at the available spaces in the opponent home board and weighting the probability of landing in the space with the standard <a href="http://compgammon.blogspot.com/2012/01/berliners-bkg-program.html">escape count</a> from the Berliner primes calculation. It is meant to give some indication of how good or bad it'd be to get hit. I'm focusing on inputs along these lines because when looking at which positions are calculated most poorly in the benchmarks, it tends to be boards where there is a significant chance of being hit and landing behind a prime.<br /><br />This one had some success, and while the improvement is still incremental, it resulted in my best player to date. The resulting player that uses the new input is Player 3.5. It is identical to <a href="http://compgammon.blogspot.com/2012/04/player-34-incrementally-better.html">Player 3.4</a>, except for two new inputs: the input as described above, one for each player.<br /><br />Its GNUbg benchmark scores are Contact 13.0, Crashed 11.5, and Race 0.766. Player 3.4's scores are 13.3, 11.7, and 0.766, so noticeably better but still nothing dramatic (though notably some improvement in Contact, the most important benchmark). It seems that to get a significantly stronger player I'll have to add a bunch of inputs, each of which offers reasonably incremental benefits.<br /><br />In cubeless money player against Player 3.4, it scores an average +0.0033ppg +/- 0.0021ppg in 400k games. Against PubEval it scores an average +0.592ppg +/- 0.005ppg in 100k games and wins 69.5% of the games.<br /><br />Still not nearly <a href="http://compgammon.blogspot.com/2012/02/gnubg-0-ply-benchmarks-reference.html">as good</a> as <a href="http://compgammon.blogspot.com/2012/01/pubeval-benchmark.html">GNUbg 0-ply</a>! But creeping closer.<br /><br />To be honest I'm not really sure whether the improved performance came because of the new input or because I slightly changed the training algorithm. In this case I started with random weights for the new inputs and ran supervised learning against the GNUbg training databases (contact & crashed). And instead of bouncing back and forth between a large alpha (1) and smaller alphas, I just used a small and constant alpha of 0.03. The resulting benchmark score slowly improved over 1,100 iterations, which took several days to run.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-68250955194869267632012-04-27T18:56:00.001-07:002012-04-27T18:56:09.094-07:00New inputs failure: bar hit/entry probability<div dir="ltr" style="text-align: left;" trbidi="on">I've been spending a little time looking at cases where my <a href="http://compgammon.blogspot.com/2012/04/player-34-incrementally-better.html">Player 3.4</a> does poorly in the GNUbg contact <a href="http://compgammon.blogspot.com/2012/02/better-benchmark-gnubg-benchmark.html">benchmarks database</a>, to get some feel for what new inputs I might try.<br /><br />It looks it's leaving blots too often when the opponent has a good prime blocking the way out of the home board.<br /><br />So I tried two new inputs: the odds of entering the opponent's home board if there were a checker on the bar; and the odds of hitting an opponent blot in his home board if there were a checker on the bar.<br /><br />I tried two training approaches: first, adding random weights for just those four new weights (the two inputs times two players) and doing supervised learning on the GNUbg training databases; and also starting from scratch, random weights everywhere, and doing TD training through self-play and then SL on the GNUbg training databases.<br /><br />The conclusion: neither worked. In both cases the new player was about the same as or a little worse than Player 3.4. So these aren't the right new inputs to add.<br /><br />Back to the drawing board.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-7523969329006493382012-04-25T12:26:00.004-07:002012-04-25T12:26:42.913-07:00Jump model final version posted<div dir="ltr" style="text-align: left;" trbidi="on">I've posted a new version of my jump model for cube decision points:<br /><br />http://arxiv.org/abs/1203.5692<br /><br />This version is quite similar to the last one, with just a few elaborations added after another set of very productive discussions with Rick Janowski. He's been a huge help in pulling this together.<br /><br />I doubt this paper will change again, though I'll probably revisit the model in the future with another paper. Probably to focus on how to estimate the local jump volatility accurately.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com2tag:blogger.com,1999:blog-4837627029742618106.post-24188581463906583242012-04-17T21:06:00.000-07:002012-04-22T11:33:38.035-07:00PubEval trained using ListNet<div dir="ltr" style="text-align: left;" trbidi="on">I spent some time looking at PubEval again - not my implementation, which is fine now, but rather how Tesauro came up with it in the first place. One discussion <a href="http://www.scholarpedia.org/article/Td-gammon">suggests</a> that he trained it using "comparison training", a machine learning approach he seems to have come up with - some kind of supervised learning on a manually-prepared set of benchmarks. Each benchmark (I'm assuming) was a list of moves given a starting point and a dice roll, where the moves were ordered by goodness.<br /><br />I tried to reproduce this. I couldn't find any proper references to "comparison training", but there's a lot of relatively recent literature on machine learning algorithms for generating rankings, which is the same sort of thing.<br /><br />We can do a lot better than Tesauro's hand crafted training set: we have the GNUbg benchmark databases that are much larger and more accurate.<br /><br />So what we want is an algorithm that we can feed a training set, where each element of the set has the five boards listed for each move and the rolled-out equities for each. The inputs that define the board are the PubEval inputs, and the evaluation function should be a linear function of the inputs (like PubEval is).<br /><br />Wikipedia has a <a href="http://en.wikipedia.org/wiki/Learning_to_rank">nice summary</a> of different machine learning approaches to ranking estimators.<br /><br />The <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=70428">ListNet</a> algorithm seems like a good choice. I implemented it and trained it on the GNUbg contact and race benchmark databases.<br /><br />It fairly quickly converged to a better solution than Tesauro's original PubEval. That is, the weights I found can be plugged into the usual PubEval setup, but give a slightly stronger player. Not much better, but noticeably stronger. Not surprising given the more comprehensive training set.<br /><br />The weights themselves, and the output values, were quite different to PubEval. The ListNet algorithm effectively trained the regression to approximate equity, so in this approach the board evaluations correspond to equity (though optimized for equity differences on similar boards rather than outright equity).<br /><br />The GNUbg benchmark scores were: Contact 41.5, Crashed 47.5, and Race 1.90. This compares to PubEval's <a href="http://compgammon.blogspot.com/2012/02/bug-in-benchmark-calculation-fixed.html">scores</a> of 44.1, 49.7, and 3.54.<br /><br />The weights are available on my Dropbox account: <a href="http://dl.dropbox.com/u/57178206/pubeval_race.txt">race</a> and <a href="http://dl.dropbox.com/u/57178206/pubeval_contact.txt">contact</a>.<br /><br />In 100k cubeless games (with variance reduction) against PubEval it scores +0.043ppg +/- 0.004ppg. Again, noticeably better.<br /><br />Of course this is a terrible player compared to neural network players, but it's neat to be able to reproduce something like what Tesauro did with PubEval. And it was a good excuse to play around with the newer machine learning algorithms focused on ranking.<br /><br />As well this might be an interesting training approach for a neural network. The network would be optimized for checker play, so would be less efficient at absolute equity estimation required for doubling decisions. But perhaps one could have two sets of networks, one optimized for checker play, the other for doubling decisions.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-48315112145402648692012-04-10T13:33:00.000-07:002012-04-10T13:35:23.727-07:00Average number of games in a match<div dir="ltr" style="text-align: left;" trbidi="on">Some more interesting statistics on match play: the average number of games per match.<br /><br />I ran self-play for two players using <a href="http://compgammon.blogspot.com/2012/04/player-34-incrementally-better.html">Player 3.4</a> for checker play (<a href="http://compgammon.blogspot.com/2012/04/cubeful-vs-cubeless-equities-in-checker.html">cubeful equity</a> optimization) and <a href="http://compgammon.blogspot.com/2012/04/optimal-janowski-cube-life-index-in.html">Janowski match model</a> with cube life index 0.70 and looked at how many games it took on average to finish a match.<br /><br />The most interesting result is that the average number of games divided by the match length seems to converge reasonably well to a value around 0.65.<br /><br />Here is a chart of the results: blue line/left axis shows average number of games, red line/right axis shows the average number of games divided by the match length:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-ZgTS_-S8A6A/T4SWIUVB5wI/AAAAAAAAAd8/f8h7XBXLN4w/s1600/matchgames.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="458" src="http://4.bp.blogspot.com/-ZgTS_-S8A6A/T4SWIUVB5wI/AAAAAAAAAd8/f8h7XBXLN4w/s640/matchgames.jpg" width="640" /></a></div><br />The only other <a href="http://www.bkgm.com/rgb/rgb.cgi?view+712">source</a> I could find for similar statistics gave similar results (out to match length 11).<br /><br />I ran the experiment with x=0.5 and x=0.9 as well to see how that affected the converged average number of games per match length. x=0.5 gave a converged ratio of 0.51; x=0.9 gave 0.90. This compares to 0.65 for x=0.7.<br /><br />So the average number of games per match divided by the match length converges (for long matches) to something surprisingly close to the cube life index itself! I'm sure this is just a coincidence, but an interesting one.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-67576380067158146472012-04-10T12:39:00.003-07:002012-04-11T07:14:36.968-07:00Optimal Janowski cube life index in match play<div dir="ltr" style="text-align: left;" trbidi="on">In my <a href="http://compgammon.blogspot.com/2012/04/janowski-model-extended-to-match-play.html">last post</a> I looked at extending Janowski's cubeful equity model to match play.<br /><br />The conclusion: match play also favors a cube life index very close to 0.70.<br /><br />I played a Janowski match strategy with different cube life indexes against a Janowski match strategy with cube life index 0.70 as a benchmark. I ran 40k matches with variance reduction and recorded the average points per match.<br /><br />The results:<br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th>Match<br />Length</th><th>x=0.5</th><th>x=0.6</th><th>x=0.8</th><th>x=0.9</th></tr><tr><td>3</td><td>-0.004</td><td>0.000</td><td>-0.004</td><td>-0.007</td></tr><tr><td>5</td><td>-0.007</td><td>+0.007</td><td>+0.002</td><td>-0.024</td></tr><tr><td>7</td><td>-0.021</td><td>-0.003</td><td>-0.008</td><td>-0.029</td></tr><tr><td>9</td><td>-0.028</td><td>-0.006</td><td>+0.003</td><td>-0.037</td></tr></tbody></table><br />If I fit parabolas through the results and force them to pass through zero ppm at x=0.7, I find optimal cube life indexes of x=0.61 for match length 3, x=0.64 for match length 5, x=0.68 for match length 7, and x=0.69 for match length 9.<br /><br />All average points per match have a standard error of +/- 0.004ppm, so the statistics are marginal for the shorter match lengths.<br /><br />There is some evidence for a smaller cube life index for shorter matches, but not much. In general the optimal match cube life index looks very close to the optimal money cube life index.<br /><br /><b>UPDATE</b>: I ran longer simulations for more values of the cube life index for match lengths 3, 5, and 7 to try to get more accurate statistics. From those data I get optimal cube life indexes of 0.70, 0.67, and 0.69 for match lengths 3, 5, and 7 respectively. So no evidence of a smaller optimal cube life index for shorter matches: everything should use 0.70.<br /><br />That said, the performance difference for short matches of using a suboptimal cube life index is pretty infinitesimal. It becomes a bigger deal for longer matches.</div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-32292237915403100132012-04-09T08:15:00.000-07:002012-04-10T12:29:27.807-07:00Janowski model extended to match play<div dir="ltr" style="text-align: left;" trbidi="on">When I was looking at <a href="http://compgammon.blogspot.com/2012/03/full-match-equity-tables.html">match equity tables</a> I wondered whether you could extend <a href="http://compgammon.blogspot.com/2012/03/janowski-model-revisited.html">Janowski's model</a> (using a "cube life index" to interpolate between dead and live cube equities as a proxy for the jumpiness of game state) to match play. I'm pretty sure this is what GNUbg does based on their <a href="http://www.gnubg.org/documentation/doku.php?id=appendix#7">docs</a>.<br /><br />Turns out it's pretty straightforward if you assume the same match equity table as you calculate with Tom Keith's algorithm, which is a live cube model - it assumes game-winning probability changes diffusively, and that W and L (the conditional points won on win or lost on loss) are independent of the game-winning probability. That's the same as Janowski's live cube limit, except W and L are calculated from entries in the match equity table instead of the usual money scores for wins, gammons, and backgammons.<br /><br />The Keith algorithm mentioned before gives you the cubeful match equity in the live cube limit. The dead cube limit has cubeful equity that's linear in P, running from -L at P=0 to +W at P=1. The model cubeful equity is just the weighted sum of the live and dead cube cubeful equities.<br /><br />I implemented this to see how it compares to using a Janowski money model for doubling decisions in tournament play. That's of course inefficient - it doesn't account for match equity - but it's an interesting benchmark to show how much a match-aware doubling strategy matters.<br /><br />Checker play was <a href="http://compgammon.blogspot.com/2012/04/player-34-incrementally-better.html">Player 3.4</a> optimizing on <a href="http://compgammon.blogspot.com/2012/04/cubeful-vs-cubeless-equities-in-checker.html">cubeful equity</a>, and I ran 40k matches (with <a href="http://compgammon.blogspot.com/2012/04/self-play-variance-reduction.html">variance reduction</a>) for a range of match lengths and cube life indexes. For a given cube life index, both the match and money doubling strategies share the same cube life index, which seems a fair comparison. Remember, for money play, a cube life index of x=<a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index_30.html">0.70 was optimal</a>.<br /><br />The entries in the table are average points per match in the 40k matches. All values have a standard error of +/- 0.005ppm.<br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th>Match<br />Length</th><th>x=0</th><th>x=0.25</th><th>x=0.50</th><th>x=0.75</th><th>x=1</th></tr><tr><td>3</td><td>+0.056</td><td>+0.047</td><td>+0.051</td><td>+0.066</td><td>+0.054</td></tr><tr><td>5</td><td>+0.073</td><td>+0.080</td><td>+0.091</td><td>+0.105</td><td>+0.067</td></tr><tr><td>7</td><td>+0.115</td><td>+0.138</td><td>+0.137</td><td>+0.142</td><td>+0.061</td></tr><tr><td>9</td><td>+0.115</td><td>+0.148</td><td>+0.150</td><td>+0.154</td><td>+0.081</td></tr></tbody></table><br />The main conclusion here is that using a proper match-aware doubling strategy makes a huge difference in outcome. The impact is larger for longer matches because any per-game edge gets magnified over a match. In principle for very long matches the match strategy should become the money strategy, but for matches up to length 9, anyways, that is not apparent in the statistics.<br /><br />This assumes the same match equity table as before, which is based on the live cube model. Really I should recalculate the match equity table assuming the same Janowski model for cube decisions, which I think will change it a bit. But I'll leave that for another day.<br /><br />Or I could extend my <a href="http://compgammon.blogspot.com/2012/04/jump-model-revisited.html">jump model</a> to match play - it should be a relatively simple extension, since like with Janowski it's just about changing W and L to be based off match equities.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-20318239745337669272012-04-07T13:36:00.002-07:002012-04-08T16:50:23.969-07:00Cubeful vs cubeless equities in checker play decisions<div dir="ltr" style="text-align: left;" trbidi="on">So far all my players have made checker play decisions by choosing the position with the best cubeless equity, even after I introduced doubling strategies to decide when to offer & take doubles. This is clearly suboptimal: the doubling strategies can calculate cubeful equity, so the checker play should choose moves based on the position with the best cubeful equity, not best cubeless equity.<br /><br />I'm referring to money games; for match play they should optimize match equity.<br /><br />I extended my backgammon framework to allow the players to optimize on cubeful equity in checker play to see how much that would improve things.<br /><br />The answer, it seems: at least for my intermediate-strength players, not much!<br /><br />I played 100k cubeful money games (with <a href="http://compgammon.blogspot.com/2012/04/self-play-variance-reduction.html">variance reduction</a>) between two players (both <a href="http://compgammon.blogspot.com/2012/04/player-34-incrementally-better.html">Player 3.4</a>) that both use <a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index_30.html">Janowski x=0.7</a> for their doubling strategy. The first used cubeful equities for checker play and the second used cubeless equity.<br /><br />The first player scored -0.0004ppg +/- 0.0044ppg. Hardly a significant advantage to using cubeful equity when choosing the best move - the score is indistinguishable from zero.<br /><br />I also thought I'd run the cubeful-equity player through the GNUbg benchmarks. The benchmarks are for cubeless play, so that should give something worse than for the cubeless-equity player. But the benchmarks are all about choosing the best move out of the possible moves, so all that matters there is relative equity difference, not absolute equity, and perhaps that's not affected much by cubeful vs cubeless.<br /><br />The cubeless benchmark scores are Contact 13.32, Crashed 11.69, and Race 0.766.<br /><br />Using a cubeful-equity player, always assuming a centered cube, the scores were 13.44, 12.16, and 0.766. So the Contact and Crashed scores got a little bit worse, but hardly changed. Using cubeful equity in checker play decisions does not seem to make a big difference in almost every case.<br /><br />Assuming player-owned cube the scores were 13.41, 12.03, and 0.768. Assuming opponent-owned cube the scores were 13.42, 11.82, and 0.764. So cube position does not matter much either.<br /><br />The conclusion: it doesn't matter much if you use cubeful or cubeless equity in checker play decisions.<br /><br />Nonetheless the cubeful equity performance is marginally better, and there are probably edge case conditions where it matters more, so I'll start using cubeful equity in money play.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-43064289679874244422012-04-06T14:07:00.002-07:002012-04-07T03:37:15.663-07:00Player 3.4: incrementally better<div dir="ltr" style="text-align: left;" trbidi="on">I let the supervised learning algorithm train for a while longer, starting with the <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a> networks (i.e. the same networks and inputs).<br /><br />Its <a href="http://compgammon.blogspot.com/2012/02/better-benchmark-gnubg-benchmark.html">benchmark scores</a> were Contact 13.3, Crashed 11.7, and Race 0.766, so my best player yet, but only incrementally better than Player 3.3's scores of 13.3, 12.0, and 0.93.<br /><br />On the most important benchmark, Contact, it was unchanged. And in self-play against Player 3.3 (100k games with <a href="http://compgammon.blogspot.com/2012/04/self-play-variance-reduction.html">variance reduction</a>) its score was zero within a standard error of 0.0009ppg. Not exactly a startling improvement!<br /><br />Nonetheless it is measurably better in the other GNUbg benchmarks so I'll start using it as my best player.<br /><br />Playing 100k cubeless money games against PubEval it scored +0.586ppg +/- 0.001ppg, winning 69.22% of the games; Player 3.3 in the same games scores +0.585ppg +/- 0.001ppg and wins 69.20% of the games. So again very little difference, but still incrementally my best player.<br /><br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-88296536385417231992012-04-06T10:03:00.004-07:002012-04-06T13:20:29.896-07:00Self-play variance reduction<div dir="ltr" style="text-align: left;" trbidi="on">I've done a lot of bot self-play to gather statistics on relative strength and so on.<br /><br />One obvious way to reduce variance in the results is by running half the results with player 1 vs player 2, then half the results with player 2 vs player 1, but using the same random seeds as with the first half. That is, in both halves the dice are the same, but the players alternate taking them. That significantly reduces the impact of dice luck. In the limit of both players being exactly equal in terms of strategy this will return exactly 0 points per game and 50% wins, so perfect variance reduction.<br /><br />Note that only half as many independent trials are used; that tends to make the standard error worse. But the variance reduction should more than compensate.<br /><br />I haven't done this before, which is a little embarrassing, honestly.<br /><br />But, now that I've thought of it, I figured I'd do a little statistical analysis to investigate how effective the variance reduction actually is.<br /><br /><b>The first example</b>: playing 10k self-play money games where both players use <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a> for checker play, the first player uses a Janowski doubling strategy with x=0.7, and the second player uses Janowski with x=0.6. In my <a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index_30.html">previous experiments</a> where I ran 1M money games with no variance reduction I got an average score of +0.018ppg with a standard error of 0.003ppg.<br /><br />No variance reduction: +0.017ppg +/- 0.032ppg<br />Variance reduction: +0.026 +/- 0.014ppg<br /><br />Both are consistent with the more accurate estimate, but the standard error with variance reduction was 2.3x smaller. That corresponds to the standard error when running 2.3^2=5.3x as many simulated games with no variance reduction, so saves over a factor of 5 in run time.<br /><br /><b>The second example</b>: playing 10k self-play cubeless money games, Player 3.3 against PubEval.<br /><br />No variance reduction: +0.610ppg +/- 0.014ppg<br />Variance reduction: +0.583 +/- 0.012ppg<br /><br />In this case the variance reduction doesn't help much. That's likely because the two strategies are sufficiently different that they make different moves, and the same dice throws applied to different positions imply different luck.<br /><br />So when two players are comparable the variance reduction is quite powerful. When they imply significantly different checker play the technique is less powerful, but still worth doing.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com1tag:blogger.com,1999:blog-4837627029742618106.post-28334768494494846422012-04-05T17:09:00.000-07:002012-04-05T17:09:28.809-07:00Jump model revisited<div dir="ltr" style="text-align: left;" trbidi="on">After conversations with Rick Janowski I realized that I'd approached the jump model incorrectly. I've since revamped the whole thing and <a href="http://arxiv.org/abs/1203.5692">reposted the corrected paper</a> on arxiv.org - the same link as before, overwriting the old (incorrect) version.<br /><br />I feel pretty confident about this second cut. The implied Janowski cube life index is right around the <a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index_30.html">optimal value I found</a>, and jump volatility found by optimizing the model parameter in bot self-play ties out well with statistical estimates of jump volatility.<br /><br />I put the <a href="https://github.com/mghiggins/bgjump/blob/master/bgjumpequity.py">python code</a> I used to generate cubeful equities onto github, released under the GPL. It contains functions for calculating cubeful equity in the three cube states (player-owned cube, unavailable cube, and centered cube). Three methods for calculating the equity: numerically, solving the exact model; a linear approximation; and a nonlinear approximation that improves on the linear one. They are all developed in the paper.<br /><br /><div style="text-align: center;"><span class="Apple-style-span" style="font-size: x-large;">Cube Handling In Backgammon Money Games Under a Jump Model </span></div><br /><div style="text-align: center;"><span class="Apple-style-span" style="font-size: large;"><a href="http://arxiv.org/abs/1203.5692">Abstract</a></span></div><br /><div style="text-align: justify;">A variation on Janowski’s cubeful equity model is proposed for cube handling in backgammon money games. Instead of approximating the cubeful take point as an interpolation between the dead and live cube limits, a new model is developed where the cubeless probability of win evolves through a series of random jumps instead of continuous diffusion. Each jump is drawn from a distribution with zero mean and an expected absolute jump size called the “jump volatility” that can be a function of game state but is assumed to be small compared to the market window.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Closed form approximations for cubeful equities and cube decision points are developed as a function of local and average jump volatility. The local jump volatility can be calculated for specific game states, leading to crisper doubling decisions.</div><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-78669870972333311862012-03-30T11:08:00.003-07:002012-03-30T11:08:55.535-07:00Optimal Janowski cube life index revisited<div dir="ltr" style="text-align: left;" trbidi="on">After fixing my implementation of Janowski's doubling strategy I tried <a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index.html">again</a> to find the optimal cube life index (assuming we just use a single constant index).<br /><br />As before, I found the optimal cube life index is 0.7.<br /><br />To test this I ran doubling strategies with different cube life indexes against one with a fixed index of 0.7 to see if any other index would outperform. For indexes near 0.7 I ran 1M self-play games, using Player 3.3 for checker play. For indexes a bit further out I used 100k self-play games.<br /><br />Here are the results:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-a_y4awALdVw/T3X2QeIAz8I/AAAAAAAAAd0/P52JymreExg/s1600/janowskicompare.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="472" src="http://2.bp.blogspot.com/-a_y4awALdVw/T3X2QeIAz8I/AAAAAAAAAd0/P52JymreExg/s640/janowskicompare.jpg" width="640" /></a></div><br />Two interesting points to note. First is that the probability of win continues to increase to x=0.95 even though the average score decreases, due to the asymmetric market windows of the two players. Second is that there is a fairly dramatic falloff in performance (both in score and probability of win) for x>0.95.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-79904941844263090982012-03-28T08:46:00.001-07:002012-03-28T08:46:14.546-07:00Janowski model revisited<div dir="ltr" style="text-align: left;" trbidi="on">I was a bit incorrect in how I interpreted <a href="http://compgammon.blogspot.com/2012/03/doubling-cube-and-money-games.html">Janowski's paper</a>.<br /><br />The real way to think about his model is as a linear interpolation between the dead cube equity and the live cube equity.<br /><br />The dead cube equity is always P*(W+L)-L, normalized to the current cube value. The live cube equity is piecewise linear, and has different forms for three cases:<br /><br />1) Cube centered. 0<P<TP, equity runs from -L to -1; TP<P<CP, equity runs from -1 to 1; CP<P<1, equity runs from 1 to +W.<br />2) Cube owned by player. 0<P<CP, equity runs from -L to +1; CP<P<1, equity runs from +1 to +W.<br />3) Cube owned by opponent. 0<P<TP, equity runs from -L to -1; TP<P<1, equity runs from -1 to +W.<br /><br />TP and CP here are the live cube take and cash points, so<br /><br />TP = (L-1/2)/(W+L+1/2)<br />CP = (L+1)/(W+L+1/2)<br /><br />Then the Janowski model equity is x * live cube equity + (1-x) * dead cube equity.<br /><br />That's it. Everything else falls out of that, and in practice you don't really need to calculate initial double, redouble, too good points, or even take and cash points - you just compare cubeful equities.<br /><br />For example, the take decision calculates the equity when the player owns the doubled cube and compares that to -(cube value). If the equity on take is greater than the equity on pass, you take.<br /><br />Similarly, the double decision calculates the equity on double (which is the minimum of the equity when the opponent owns the doubled cube and +(cube value) for the pass) and compares it to the equity at the current cube value when the player owns the cube. If the doubled equity is bigger, you double.<br /><br />My error before was around too-good points: in my implementation I assumed that the too-good point was calculated by comparing the dead cube equity with pass value, thinking that you're giving up the value of the cube by not doubling. But really you're not; you still have the option of doubling if the game starts to go against you. You should be comparing the model cubeful equity at the current cube against the value you'd get on a double (the minimum of the take and pass equities there).<br /><br />This invalidates some of the statistical results I calculated for Janowski's model before, so I'll have to redo those.<br /><br />Also it makes it easy to generalize to two cube life indexes: each player's interpolation of equity uses their appropriate index.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-56933424404934933612012-03-28T05:30:00.000-07:002012-03-28T05:30:29.233-07:00Jump model performance<div dir="ltr" style="text-align: left;" trbidi="on">I implemented a doubling strategy following <a href="http://arxiv.org/abs/1203.5692">my jump model</a> and ran it against the optimal Janowski doubling strategy. All checker play was <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a>.<br /><br />This first implementation assumes a constant jump volatility, so does not make any attempt to calculate a local jump volatility in the game state. It's meant to be a benchmark to compare against the Janowski strategy with a single cube life index, which should be pretty similar in performance.<br /><br />I ran one million cubeless money games of the jump model with different jump volatilities, vs the Janowski strategy with the <a href="http://compgammon.blogspot.com/2012/03/optimal-janowski-cube-life-index.html">optimal</a> single cube life index of 0.7. The goal was to see which jump volatility value would give the best performance.<br /><br />In the paper I made an estimate of a constant jump volatility of 7.7% based on a calculation of the average absolute probability change from turn to turn when the cubeless win probability was near the take and cash points. That was only an approximate calculation, based on crude windows of win probability, but should be roughly what comes out of this more careful analysis.<br /><br />Here are the results of the games:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-S47i_szLjRI/T3MC6JQ3YjI/AAAAAAAAAds/_7eIQ15KEVg/s1600/jumpperf.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="435" src="http://1.bp.blogspot.com/-S47i_szLjRI/T3MC6JQ3YjI/AAAAAAAAAds/_7eIQ15KEVg/s640/jumpperf.jpg" width="640" /></a></div><br />The blue line/points represent the self-play results for average points per game for the different jump volatility inputs, running from 5% to 11%. Each has an error bar of +/- 0.004ppg which represents the standard error on the averages. The black line shows a parabolic fit through the points. The red line/points represent the probability of win for the jump model strategy.<br /><br />The jump volatility with the best ppg performance was 8.0%, quite close to the naive estimate from cubeless play statistics of 7.7%. But a jump volatility anywhere in the range 7.0% to 9.0% performed quite comparably. I fit a parabola to the ppg data and that suggests an optimal jump volatility of 8.5%, which is probably a more realistic number given the statistical uncertainty on individual points.<br /><br />The jump model with constant jump volatility performed marginally better than the Janowski model with a single cube life index, though the margins were small: a few hundredths of a point per game, in games with an average cube value of around 2.3. The differences are statistically significant (the standard error for all ppg numbers is +/- 0.004ppg) but only because I ran 1M games; really the performance is practically indistinguishable.<br /><br />That is roughly what one would expect given that the model with a constant jump volatility and constant conditional win and loss amounts W and L can be mapped directly onto Janowski's model with a single cube life index. Of course W and L are not actually constant, so the models are not practically the same, but they should be close.<br /><br />That said, the probability of win showed a much larger effect than points per game. A jump model strategy with a jump volatility of 5% won 50.2% of its games (while losing 0.006ppg in score); a strategy with jump volatility of 11% won only 44.2% of its games (but won 0.011ppg in score).<br /><br />High jump volatility means the player chooses to double at lower probability of win and pass at higher probability of win. So the games it loses because of passing a double are offset by games it wins when doubled.<br /><br /><br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-542770126895125822012-03-26T18:18:00.003-07:002012-03-27T19:16:30.724-07:00New cube decision model posted on arxiv.org<div dir="ltr" style="text-align: left;" trbidi="on">I posted the next draft of my new model for calculating cube decision points to the preprint site arxiv.org, so now it has a proper stable <a href="http://arxiv.org/abs/1203.5692">link</a>:<br /><br /><div style="text-align: center;"><span class="Apple-style-span" style="font-size: x-large;">Cube Handling In Backgammon Money Games Under a Jump Model</span></div><br /><div style="text-align: center;"><span class="Apple-style-span" style="font-size: large;">Abstract</span></div><br />A variation on Janowski’s cubeful equity model is proposed for cube handling in backgammon money games. Instead of approximating the cubeful take point as an interpolation between the dead and live cube limits, a new model is developed where the cubeless probability of win evolves through a series of random jumps instead of continuous diffusion. Each jump is drawn from a distribution with zero mean and an expected absolute jump size called the “jump volatility” that can be a function of game state but is assumed to be small compared to the market window.<br /><br />Simple closed form approximations for the market window and doubling points are developed as a function of local and average jump volatility. The local jump volatility can be calculated for specific game states, leading to crisper doubling decisions. <br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-42492293138571506932012-03-26T09:10:00.000-07:002012-03-26T09:12:37.058-07:00Checkpoint: recent progress and next steps<div dir="ltr" style="text-align: left;" trbidi="on">Since my <a href="http://compgammon.blogspot.com/2012/01/checkpoint-recent-progress-and-next.html">last checkpoint</a> around 2m ago I've made quite a lot of progress.<br /><br />Main highlights:<br /><ul style="text-align: left;"><li>Networks: I added a <a href="http://compgammon.blogspot.com/2012/02/player-32-adding-crashed-network.html">crashed network</a> following the GNUbg definition of "crashed".</li><li>Network inputs: I added a <a href="http://compgammon.blogspot.com/2012/01/player-24-berliner-primes-inputs.html">new input</a> to the contact and crashed networks that measures strength of prime blocking. I also <a href="http://compgammon.blogspot.com/2012/02/player-31-expanded-race-inputs.html">extended</a> the race inputs to more granularly track the checkers born in.</li><li>Training: I grabbed the <a href="http://compgammon.blogspot.com/2012/02/supervised-learning-using-gnubg.html">GNUbg training databases</a> for contact, crashed, and race networks, and ran supervised learning against them. The <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">best performance</a> is first trained using TD learning, then uses those network weights as the starting point for supervised learning.</li><li>Benchmarking: I moved from bot self-play performance to benchmarking against the <a href="http://compgammon.blogspot.com/2012/02/better-benchmark-gnubg-benchmark.html">GNUbg benchmark databases</a> for cubeless play, since that gives a more accurate and robust set of benchmarks for different game phases. I ran some <a href="http://compgammon.blogspot.com/2012/02/bug-in-benchmark-calculation-fixed.html">stats</a> to look at the relationship between self-play results and GNU benchmark scores.</li><li>Cube handling: this was a new area for me since my personal game is weak on cube play. I learned about <a href="http://compgammon.blogspot.com/2012/03/doubling-cube-and-money-games.html">Janowski's classic model</a> for money games, as well as how to construct <a href="http://compgammon.blogspot.com/2012/03/full-match-equity-tables.html">match equity tables</a> for tournament play.</li><li>New model for money game cube action: I developed a <a href="http://compgammon.blogspot.com/2012/03/new-model-paper-v2.html">new model</a> for cube decisions in money games, moving from the "live cube" limit where probability of win is assumed to diffuse, to a jump model where the probability of win jumps each turn.</li></ul><div>Along the way I developed several new players incorporating the new ideas. As of now my best checker player is <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a>, which scores GNUbg benchmark scores of Contact 13.3, Crashed 12.0, and Race 0.93 (compared to GNUbg <a href="http://compgammon.blogspot.com/2012/02/gnubg-0-ply-benchmarks-reference.html">0-ply scores</a> 10.5, 11.0, and 1.01).</div><div><br /></div><div>Stuff still to do:</div><div><ul style="text-align: left;"><li>Money play: implement my new model properly and test out its performance. Hopefully it should perform better than the standard approaches.</li><li>Match play: implement matches properly in my backgammon framework and see if I can extend my new model approach to match play as well (coming up with better match equity tables along the way).</li><li>Checker play: my players' Contact benchmark scores are still pretty weak compared to GNUbg. I suspect I need to add more custom inputs. I also want to investigate some more advanced supervised learning techniques applied to training against the GNUbg training databases.</li><li>Rollouts: rollouts are still pretty basic in my framework; I need to properly implement variance reduction and cutting off rollouts before the end of the game.</li><li>FIBS hookup: I still want to hook up my bot to FIBS and see how it performs there in real-world games.</li><li>Networks: I've been talking with Øystein Johansen about maybe breaking up the "contact" specification into separate nets for different game phases, perhaps using k-mean clustering to define the phases.</li></ul><div>Fun stuff!<br /><br /></div></div></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-14148522649233853692012-03-24T14:27:00.001-07:002012-04-06T17:02:47.891-07:00New model paper v2<div dir="ltr" style="text-align: left;" trbidi="on"><span class="Apple-style-span" style="font-family: Helvetica;">Here's the <a href="http://dl.dropbox.com/u/57178206/bgcubejumps2.pdf">second draft of the paper</a>, revamped considerably:</span><br /><div style="font-family: Helvetica;"><br /></div><div style="font-family: Helvetica;">The main changes:</div><div style="font-family: Helvetica;"><div><div><ul style="text-align: left;"><li>Added local and average jump volatility. Take and cash points depend only on the average volatility since on a pass/take decision you're comparing post-double equity with -1. The post-double equity depends on the jump volatility at the cash point; but should the game turn and head back up there it'll probably be in quite a different state than it's in near the take point. So using the average jump volatility seems more reasonable. Local jump volatility impacts the cubeful equity at the current cube level, which in turn impacts doubling decisions.</li><li>I note that if I assume a constant (single) jump volatility, it maps very nicely onto Janowski's model with one x. But the local + average jump volatility version does not map as nicely onto his model extension to a cube life index for each player, x1 and x2.</li><li>I changed the definition of jump volatility to be the expected absolute jump size instead of jump standard deviation. That's what the equity calcs care about. In fact, I only need to assume the jump distribution is symmetric for positive and negative jumps; if that is true then I don't need to know anything more about the jump distribution than its expected absolute jump size. Pretty nice! I got caught up for a while playing around with different distributional assumptions before I cottoned onto that.</li><li>I rewrote Appendix 3, the one that works through the details of the jump model, to try to make it more clear when you should use local vs average volatility. That I suspect is the subtlest part of this. I think I understand this properly now, but I went through a couple of iterations myself before I got here, so it could be that I'm still getting something wrong.</li></ul></div></div></div><div style="font-family: Helvetica;">This is on the back of a very rewarding conversation with Rick Janowski, who pointed out how I needed to rethink this in terms of something more like his model where he has a different cube life index for both players.</div><div style="font-family: Helvetica;"><br /></div><div><br /></div></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-75144922925917905332012-03-21T20:53:00.001-07:002012-03-21T20:53:32.629-07:00Optimal Janowski cube life index<div dir="ltr" style="text-align: left;" trbidi="on">I tried to determine the optimal constant Janowski cube life index by playing players with different cube life indexes against each other.<br /><div><br /></div><div>I started broadly by playing different cube life indexes against a fixed 0.7, since 0.7 is near where other people have quoted the cube life index.</div><div><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th>Statistic</th><th>x=0</th><th>x=0.25</th><th>x=0.50</th><th>x=0.75</th><th>x=1</th></tr><tr><td>Average points per game</td><td>-0.30</td><td>-0.14</td><td>-0.04</td><td>0.00</td><td>-0.09</td></tr><tr><td>Probability of win</td><td>37.5</td><td>41.8</td><td>45.9</td><td>51.2</td><td>58.3</td></tr><tr><td>Average cube</td><td>2.32</td><td>2.40</td><td>2.27</td><td>1.96</td><td>1.45</td></tr></tbody></table><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span></div><br />Interesting that the probability of win was well above 50% but on average lost points for x=1. It must be because when it wins it wins small, but when it loses it loses big. Presumably that's because the player gets doubled and takes when the game goes against it, but only doubles past the cash point when the game goes in the player's favor.<br /><br />Next I tried to zero in on the winning range by running the same thing in a tighter range:<br /><br /><div><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Statistic</div></th><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">x=0.55</div></th><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">x=0.60</div></th><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">x=0.65</div></th><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">x=0.70</div></th><th><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">x=0.75</div></th></tr><tr><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Average points per game</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">-0.03</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">-0.01</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">0.00</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">0.00</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">0.00</div></td></tr><tr><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Probability of win</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">46.9</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">47.8</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">48.9</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">50.0</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">51.2</div></td></tr><tr><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Average cube</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">2.23</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">2.17</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">2.11</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">2.04</div></td><td><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">1.96</div></td></tr></tbody></table><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span></div></div><div></div><br />So it seems like 0.7 is indeed the optimal cube life index in terms of expected points per game, at least within +/- 0.05.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-24348217343478237942012-03-21T18:18:00.001-07:002012-03-21T18:18:28.727-07:00Janowski-model money game cube statistics<div dir="ltr" style="text-align: left;" trbidi="on">I've got a working doubling strategy in my backgammon framework now for money games, applying <a href="http://compgammon.blogspot.com/2012/03/doubling-cube-and-money-games.html">Janowski's model</a>.<br /><br />Now I can calculate some statistics on the values of the cube; kind of like <a href="http://www.bkgm.com/rgb/rgb.cgi?view+674">an analysis</a> on Jellyfish doubling that was done in 1999.<br /><br />I'm interested in how doubling statistics change as Janowski's cube life index (sometimes called cube efficiency) varies. I ran 100k cubeful money games, using <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a> for checker play, and Janowski's doubling model for different cube life index values.<br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th>Statistic</th><th>x=0</th><th>x=0.25</th><th>x=0.50</th><th>x=0.75</th><th>x=1</th></tr><tr><td>Percent cashed</td><td>21.8</td><td>27.4</td><td>40.0</td><td>50.9</td><td>96.3</td></tr><tr><td>Percent single</td><td>55.4</td><td>51.2</td><td>45.9</td><td>34.8</td><td>0.4</td></tr><tr><td>Percent gammon</td><td>21.8</td><td>20.3</td><td>18.1</td><td>13.6</td><td>3.1</td></tr><tr><td>Percent backgammon</td><td>1.2</td><td>1.1</td><td>1.0</td><td>0.7</td><td>0.2</td></tr><tr><td>Average cube</td><td>14.7</td><td>4.49</td><td>2.72</td><td>1.89</td><td>1</td></tr><tr><td>Percent cube=1</td><td>0.6</td><td>3.2</td><td>12.3</td><td>33.9</td><td>100</td></tr><tr><td>Percent cube=2</td><td>24.2</td><td>47.8</td><td>61.0</td><td>56.7</td><td>0</td></tr><tr><td>Percent cube=4</td><td>24.4</td><td>31.3</td><td>21.2</td><td>8.5</td><td>0</td></tr><tr><td>Percent cube=8</td><td>18.6</td><td>12.2</td><td>4.5</td><td>0.8</td><td>0</td></tr><tr><td>Percent cube=16</td><td>12.8</td><td>4.1</td><td>0.8</td><td>0</td><td>0</td></tr><tr><td>Percent cube=32</td><td>8.2</td><td>1.1</td><td>0.1</td><td>0</td><td>0</td></tr><tr><td>Percent cube=64</td><td>11.1</td><td>0.4</td><td>0</td><td>0</td><td>0</td></tr></tbody></table><br />In the case of x=1 (the live cube limit) the initial double point is right at the cash point, so the player never offers a double when the opponent will take. Most games then end in cashes.<br /><br /><br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-11926268502383991412012-03-20T08:20:00.000-07:002012-03-20T08:27:50.012-07:00New model for money game cubeful equity<div dir="ltr" style="text-align: left;" trbidi="on">I came up with a <a href="http://dl.dropbox.com/u/57178206/bgcubejumps.pdf">new model</a> for estimating cubeful equity in backgammon money games that seems like an interesting way to think about the <a href="http://compgammon.blogspot.com/2012/03/doubling-cube-and-money-games.html">Janowski</a> cube life index in specific game states:<br /><br /><div style="text-align: center;"><span style="font-size: large;">Abstract</span></div><br />A variation on Janowski’s cubeful equity model is proposed for cube handling in backgammon money games. Instead of approximating the cubeful take point as an interpolation between the dead and live cube limits, a new model is developed where the cubeless probability of win evolves through a series of random jumps instead of continuous diffusion. Jumps occur when a Poisson process fires, and each jump is drawn from a normal distribution with zero mean and a standard deviation called the “jump volatility” that can be a function of win probability but is assumed to be small compared to the market window.<br /><br />Simple closed form approximations for the market window and doubling points are developed as a function of local jump volatility. The jump volatility can be calculated for specific game states, leading to crisper doubling decisions.<br /><br />All cube decision points under this model match Janowski’s if his cube life index is implied from this model’s take point, so these results can be viewed as a framework for estimating the Janowski cube life index more accurately for specific game states.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-40121252685720213682012-03-15T07:40:00.001-07:002012-03-15T07:58:17.129-07:00Match equity table with MWC<div dir="ltr" style="text-align: left;" trbidi="on">Most people show match-winning chance (MWC) in their match equity table instead of equity. Of course the standard conversion of equity = 2 MWC - 1 holds. Here are my calculated <a href="http://compgammon.blogspot.com/2012/03/full-match-equity-tables.html">match equity tables</a> showing MWC:<br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 18px;"></span><br /><div class="nobrtable"><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><br /></th><th>1-away</th><th>2-away</th><th>3-away</th><th>4-away</th><th>5-away</th><th>6-away</th><th>7-away</th><th>8-away</th><th>9-away</th><th>10-away</th><th>11-away</th><th>12-away</th></tr><tr><td>1-away</td><td>50</td><td>68.1</td><td>75.5</td><td>81.7</td><td>84.5</td><td>89.0</td><td>91.0</td><td>93.5</td><td>94.6</td><td>96.1</td><td>96.8</td><td>97.7</td></tr><tr><td>2-away</td><td><br /></td><td>50</td><td>59.5</td><td>66.1</td><td>73.6</td><td>79.2</td><td>83.4</td><td>86.7</td><td>89.4</td><td>91.7</td><td>93.4</td><td>94.8</td></tr><tr><td>3-away</td><td><br /></td><td><br /></td><td>50</td><td>57.0</td><td>64.5</td><td>70.8</td><td>75.6</td><td>79.8</td><td>83.3</td><td>86.4</td><td>88.9</td><td>90.9</td></tr><tr><td>4-away</td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>57.4</td><td>63.9</td><td>69.2</td><td>73.7</td><td>77.9</td><td>81.5</td><td>84.5</td><td>87.0</td></tr><tr><td>5-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>56.6</td><td>62.2</td><td>67.2</td><td>71.8</td><td>75.9</td><td>79.4</td><td>82.5</td></tr><tr><td>6-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>55.9</td><td>61.1</td><td>66.0</td><td>70.4</td><td>74.4</td><td>77.8</td></tr><tr><td>7-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>55.2</td><td>60.4</td><td>65.0</td><td>69.2</td><td>73.0</td></tr><tr><td>8-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>55.2</td><td>59.9</td><td>64.3</td><td>68.3</td></tr><tr><td>9-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>54.8</td><td>59.4</td><td>63.6</td></tr><tr><td>10-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>54.6</td><td>58.9</td></tr><tr><td>11-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td><td>54.3</td></tr><tr><td>12-away</td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td><br /></td><td>50</td></tr></tbody></table></div><br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;">I left out the bottom half of the table for readability, but all values there are just 100 less the corresponding value in the top half.</span><br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"><br /></span><br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;">And here are the corresponding post-Crawford MWCs (for player n-away, opponent 1-away):</span><br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><div class="nobrtable"><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><br /></th><th>1-away</th><th>2-away</th><th>3-away</th><th>4-away</th><th>5-away</th><th>6-away</th><th>7-away</th><th>8-away</th><th>9-away</th><th>10-away</th><th>11-away</th><th>12-away</th></tr><tr><td>1-away</td><td>50</td><td>48.6</td><td>31.9</td><td>30.6</td><td>18.7</td><td>17.7</td><td>11.3</td><td>10.6</td><td>6.7</td><td>6.3</td><td>4.0</td><td>3.7</td></tr></tbody></table></div><div class="nobrtable"><br /></div><div class="nobrtable">These compare well to the "Mec26" MWC table <a href="http://www.bkgm.com/articles/Silver/Mec26.html">here</a>, mostly within 0.2%. Those are based on GNUbg rollouts for post-Crawford MWCs and using a gammon probability of 26.0% (vs 27.6% in mine). The biggest differences are in 1-away,2-away and 1-away,3-away where the MWC is different by about 0.5%.</div><div class="nobrtable"><br />Another modern MET is the <a href="http://apbg.net/matches/g11_met.pdf">G11 table</a>. Again, not much difference between my calculated MET and this one, though bigger differences (especially near the diagonal) than with Mec26.<br /><br /></div><div class="nobrtable"><br /></div><div class="nobrtable"><br /></div></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0tag:blogger.com,1999:blog-4837627029742618106.post-80309486053326991042012-03-14T07:37:00.000-07:002012-03-14T07:37:13.052-07:00Full match equity tables<div dir="ltr" style="text-align: left;" trbidi="on">The next step after getting the <a href="http://compgammon.blogspot.com/2012/03/match-equity-table-crawford-games.html">Crawford-game match equities</a> is to build out the full match equity tables for pre-Crawford games.<br /><br />I followed the algorithm defined <a href="http://www.bkgm.com/articles/met.html">here</a>, by Tom Keith, but there were a few assumptions I didn't fully understand. More on that later. Here is my calculated pre-Crawford match equity table, out to 12-away:<br /><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span></div><div class="nobrtable"><span class="Apple-style-span" style="font-size: xx-small;"><br /></span><br /><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><span class="Apple-style-span" style="font-size: xx-small;"><br /></span></th><th><span class="Apple-style-span" style="font-size: xx-small;">1-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">2-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">3-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">4-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">5-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">6-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">7-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">8-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">9-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">10-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">11-away</span></th><th><span class="Apple-style-span" style="font-size: xx-small;">12-away</span></th></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">1-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.362</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.510</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.635</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.690</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.780</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.820</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.869</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.892</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.922</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.936</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.953</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">2-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.362</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.189</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.322</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.472</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.585</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.667</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.733</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.788</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.834</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.867</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.896</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">3-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.510</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.189</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.139</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.289</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.415</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.513</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.595</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.667</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.729</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.777</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.819</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">4-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.635</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.322</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.139</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.149</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.279</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.385</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.474</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.558</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.630</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.690</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.741</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">5-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.690</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.472</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.289</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.149</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.131</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.245</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.343</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.436</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.518</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.588</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.649</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">6-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.780</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.585</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.415</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.279</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.131</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.118</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.221</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.320</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.409</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.488</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.557</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">7-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.820</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.667</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.513</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.385</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.245</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.118</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.105</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.208</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.300</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.385</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.460</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">8-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.869</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.733</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.595</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.474</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.343</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.221</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.105</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.103</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.198</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.287</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.367</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">9-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.892</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.788</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.667</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.558</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.436</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.320</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.208</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.103</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.096</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.188</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.271</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">10-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.922</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.834</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.729</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.630</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.518</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.409</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.300</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.198</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.096</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.092</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.178</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">11-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.936</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.867</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.777</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.690</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.588</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.488</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.385</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.287</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.188</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.092</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0.087</span></td></tr><tr><td><span class="Apple-style-span" style="font-size: xx-small;">12-away</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.953</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.896</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.819</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.741</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.649</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.557</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.460</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.367</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.271</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.178</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">-0.087</span></td><td><span class="Apple-style-span" style="font-size: xx-small;">0</span></td></tr></tbody></table></div><span class="Apple-style-span" style="font-size: xx-small;"><br /></span><br />And here are the corresponding post-Crawford match equities, recalculated using 2-ply <a href="http://compgammon.blogspot.com/2012/03/player-33-td-then-sl.html">Player 3.3</a> to get state probabilities in the first two rolls.<br /><span class="Apple-style-span" style="color: #222222; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 13px; line-height: 18px;"></span><br /><div class="nobrtable"><table border="2" bordercolor="#3366cc" cellpadding="3" cellspacing="3" style="background-color: white;"><tbody><tr style="background-color: #3366cc; color: white; padding-bottom: 4px; padding-top: 5px;"><th><br /></th><th>1-away</th><th>2-away</th><th>3-away</th><th>4-away</th><th>5-away</th><th>6-away</th><th>7-away</th><th>8-away</th><th>9-away</th><th>10-away</th><th>11-away</th><th>12-away</th></tr><tr><td>1-away</td><td>0</td><td>0.028</td><td>0.362</td><td>0.388</td><td>0.626</td><td>0.645</td><td>0.774</td><td>0.788</td><td>0.865</td><td>0.874</td><td>0.919</td><td>0.925</td></tr></tbody></table></div><br />I calculated the average probability of win, gammon win, and backgammon win, and corrected all player-calculated probabilities so that the average probability of win was 50% and the average probability of gammon win/loss and backgammon win/loss were symmetric.<br /><br />The probability of the game ending in gammon using these adjusted state game probabilities is 27.60%. The probability of the game ending in backgammon is 1.1%.<br /><br />The approach:<br /><br /><ul style="text-align: left;"><li>Define M(n,m) as the match equity for player n-away and opponent m-away, for symmetric players before the first dice are thrown in a game (so both players have 50% chance of a win and the cube is at 1). By convention I'll assume n<m but M(m,n)=-M(n,m) so this isn't an issue.</li><li>Define Mp(n,m,c,X,Pw) as the match equity for player n-away, opponent m-away, cube value c, owned by player X (either P for player, O for opponent, or C for centered when c=1), and probability of any player win Pw. We'll assume this is piecewise linear in Pw in the range [take point,cash point] for centered cube, [take point,1] for opponent-owned cube, and [0,cash point] for player-owned cube.</li><li>The match equity table value M(n,m) = Mp(n,m,1,C,0.5), so if we can get Mp functions we're done.</li><li>We start with the cube at the maximum value, so that c is the smallest value that's greater than m (remember we assume m>n by convention). In this state the player owns the cube, since only the opponent can double to this level. No more doubling can happen, and any win or loss is a match win or loss, so Mp(n,m,cmax,P)=2 Pw - 1.</li><li>Then we step back to the cube 2x smaller than the maximum. For the case where the player owns the cube we calculate the cash point, where the opponent's equity in the maximum cube state equals their equity on a pass (which uses the match equity table for the corresponding away values which we bootstrap). We assume that the player does not double until the cash point, and so M(n,m,c,P) is linear from Pw=0 to Pw=cash point. We know the cash point value from the opponent's take/pass decision, and we know the match equity for a player loss at this cube value (which also uses the appropriate values from the match equity table for M(n,smaller m values)).</li><li>For the case when the opponent owns the cube, we assume the opponent doubles at the player's take point, and that the opponent's equity is piecewise linear in Pw from the take point to Pw=1 (where we know the opponent's winning value from the match equity table lookup).</li><li>We continue to step back to cube=1, where we assume that match equity is linear between the player's take point and cash point. This gives us Mp(n,m,1,C,Pw), which we use to calculate M(n,m) = Mp(n,m,1,C,0.5).</li></ul><div>This follows Tom Keith's algorithm and returns the match equity table displayed above, using the probability of gammon given above.</div><div><br /></div><div>The main assumption here is that the player and opponent double only when they are at the cash point or take point. This seems a bit unusual - really you'd double earlier than the cash point, and the opponent would double you earlier than your take point.</div><div><br /></div><div>I tried an alternative approach, similar to the "live cube" model in Janowski: the probability of win diffuses like a Brownian motion with constant volatility sigma. Then I can treat the doubling problem as a derivatives pricing problem, using numerical backward induction to get the fair value. Given a cube value I know the match equity at Pw=0 and Pw=1, and also that the match equity (given a cube value) satisfies the partial differential equation sigma^2/2 d^2(match equity)/dPw^2 + d(match equity)/dt = 0. You solve this PDE separately for different "layers" corresponding to different cube values and cube ownership, and at each time step mix the values across layers based on doubling decisions.</div><div><br /></div>This converged to exactly the values described by the Keith algorithm above. So the Keith algorithm is equivalent to a live cube model where probabilities diffuse continuously rather than jump. This makes me wonder whether a more accurate match equity table could mix between this limit and something like the dead cube limit, as with the money cube algorithm described by Janowski.<br /><br /></div>Mark Higginshttps://plus.google.com/103550650301831037290noreply@blogger.com0