Friday, April 6, 2012

Player 3.4: incrementally better

I let the supervised learning algorithm train for a while longer, starting with the Player 3.3 networks (i.e. the same networks and inputs).

Its benchmark scores were Contact 13.3, Crashed 11.7, and Race 0.766 (lower is better). That makes it my best player yet, but only incrementally better than Player 3.3, which scored 13.3, 12.0, and 0.93 on the same benchmarks.

On the most important benchmark, Contact, it was unchanged. And in self-play against Player 3.3 (100k games with variance reduction) its average score was indistinguishable from zero, with a standard error of 0.0009ppg. Not exactly a startling improvement!
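For readers who want to see what "zero within a standard error" means in practice, here is a minimal sketch of how a mean score and its standard error could be computed from a list of per-game results. The function names and the tiny sample list are made up for illustration; this is not the code I used, and it ignores variance reduction entirely.

```python
import math

def mean_and_standard_error(scores):
    """Return (mean, standard error of the mean) for a list of per-game scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance with the usual n - 1 denominator.
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean, math.sqrt(variance / n)

def consistent_with_zero(mean, stderr, n_sigma=1.0):
    """True if the mean lies within n_sigma standard errors of zero."""
    return abs(mean) <= n_sigma * stderr

# Hypothetical per-game scores in points, for illustration only.
sample_scores = [1, -1, 2, -2]
mean, stderr = mean_and_standard_error(sample_scores)
print(mean, stderr, consistent_with_zero(mean, stderr))
```

With real data the list would hold one entry per game (100k of them here), and the standard error shrinks like one over the square root of the number of games.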

Nonetheless, it is measurably better in the other GNUbg benchmarks, so I'll start using it as my best player.

Playing 100k cubeless money games against PubEval, it scored +0.586ppg +/- 0.001ppg, winning 69.22% of the games; Player 3.3 in the same games scored +0.585ppg +/- 0.001ppg and won 69.20% of the games. So again very little difference, but still incrementally my best player.
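As a rough sanity check on how small that win-rate gap is, the sketch below estimates the statistical noise on a win percentage over 100k games. It assumes each game is an independent win/loss draw (a plain binomial model), which is a simplification: both players faced the same games, so the noise on the difference between them is likely smaller than this suggests. The function name is my own invention for the example.

```python
import math

def win_rate_standard_error(p, n):
    """Binomial standard error of a win fraction p measured over n games."""
    return math.sqrt(p * (1.0 - p) / n)

N_GAMES = 100_000
p34, p33 = 0.6922, 0.6920  # win fractions quoted above

se34 = win_rate_standard_error(p34, N_GAMES)
gap_pct = (p34 - p33) * 100.0  # gap in percentage points
se_pct = se34 * 100.0          # noise in percentage points

# Points-per-game gap and the quoted error bar on each score.
ppg_gap = 0.586 - 0.585
ppg_se = 0.001

print(f"win-rate gap: {gap_pct:.2f} points, standard error: {se_pct:.2f} points")
print(f"ppg gap: {ppg_gap:.3f}, quoted error: {ppg_se:.3f}")
```

Under that assumption the noise on a single win rate comes out at roughly 0.15 percentage points, several times the 0.02 point gap between the two players, and the 0.001ppg gap in score is no bigger than the quoted error bar.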
