I had one last bug with PubEval: when evaluating the board value for each proposed move, it should pick the race or contact network weights based on the starting board, not based on whether each proposed move's resulting board is a race or contact position. That's because the race network and the contact network return quite different numerical values, so you can't compare them against each other (unlike networks whose outputs represent the probability of a win, etc.).
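Here's a minimal sketch of that fix in Python, not my actual code. It assumes each candidate board is already encoded as a PubEval-style input vector, and the names `linear_eval`, `pick_move`, and `start_is_race` are made up for illustration. The point is that the weight set is chosen once, from the starting board, and then used to score every candidate.

```python
from typing import Sequence

def linear_eval(weights: Sequence[float], inputs: Sequence[float]) -> float:
    """Score one encoded board as a plain dot product of weights and inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def pick_move(
    start_is_race: bool,
    candidates: Sequence[Sequence[float]],
    race_weights: Sequence[float],
    contact_weights: Sequence[float],
) -> Sequence[float]:
    """Return the best candidate board, scoring all of them with ONE weight set.

    The weight set is selected from the starting position (start_is_race),
    never per candidate, so all the scores are on the same scale.
    """
    weights = race_weights if start_is_race else contact_weights
    return max(candidates, key=lambda board: linear_eval(weights, board))

# Toy illustration with made-up two-element "encodings" and weights.
toy_race_weights = [1.0, 0.0]
toy_contact_weights = [0.0, 1.0]
toy_candidates = [[5.0, 1.0], [1.0, 3.0]]

best_from_contact_start = pick_move(False, toy_candidates, toy_race_weights, toy_contact_weights)
best_from_race_start = pick_move(True, toy_candidates, toy_race_weights, toy_contact_weights)
```

The buggy version would have picked the weights inside the loop, per candidate, which ends up comparing a race score against a contact score whenever a move breaks contact.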
I fixed that bug, and now I'm pretty sure I've got a proper implementation of the standard PubEval player. After those fixes, in 30k cubeless money matches, here is how my players perform:
Benchmark 1: +0.351ppg, 62.9% chance of win
Benchmark 2: +0.426ppg, 64.3% chance of win
Player 2: +0.435ppg, 64.6% chance of win
Player 2.1: +0.469ppg, 66.0% chance of win
Compare that to gnubg 0-ply, which scores +0.630ppg and wins 70.9% of games. So I've still got some work to do. :)