I'm now starting to experiment with some inputs beyond the basic Tesauro ones. One I had already added: an input indicating whether it's no longer possible to lose a backgammon (for each player).
I tried a new one: the number of ways to hit a shot (one input from the perspective of each player). I added it to Player 2, which has the race and contact networks plus bearoff.
I trained it for 460k runs, using alpha=0.02 for the first 200k then dropping to alpha=0.004. I started with a smaller alpha than usual because I reused the trained Player 2 weights for the existing inputs and gave only the new input a small random starting weight (uniform random between -0.1 and +0.1).
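Here is a minimal sketch of that warm start and alpha schedule. The weight layout (one list of input weights per hidden node) and the names `extend_input_weights` and `alpha_for_run` are assumptions made for illustration, not the actual code behind Player 2.1.

```python
import random

def extend_input_weights(weights, n_new_inputs=1, scale=0.1, rng=None):
    """Return a copy of `weights` with extra columns for the new inputs.

    `weights` is assumed to hold one row per hidden node, each row being
    that node's input weights. Trained values are kept as they are; each
    new weight is drawn uniformly from [-scale, +scale].
    """
    rng = rng or random.Random()
    return [
        list(row) + [rng.uniform(-scale, scale) for _ in range(n_new_inputs)]
        for row in weights
    ]

def alpha_for_run(run, switch_at=200_000, alpha_early=0.02, alpha_late=0.004):
    """Learning rate for a given training run: 0.02 before 200k, then 0.004."""
    return alpha_early if run < switch_at else alpha_late
```

With one shot-count input per player, `n_new_inputs=2` would add both columns at once; the trained columns are untouched, so the network starts out playing almost exactly like Player 2.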
After training, the player with the new input scored +0.045ppg against the player without the new input, in 30k cubeless money games. So a noticeable improvement but nothing dramatic.
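For reference, a points-per-game figure like this is just the average signed result per game. The sketch below also returns a standard error, which is an addition of mine and not something computed above; the function name `ppg_with_stderr` and the per-game `points` list are hypothetical.

```python
import math

def ppg_with_stderr(points):
    """Mean points per game and its standard error.

    `points` holds one signed result per game from the new player's side.
    In cubeless money play each result is +/-1, +/-2 or +/-3 (single game,
    gammon, backgammon).
    """
    n = len(points)
    mean = sum(points) / n
    # Sample variance of the per-game results.
    var = sum((p - mean) ** 2 for p in points) / (n - 1)
    return mean, math.sqrt(var / n)
```

Comparing the mean against a couple of standard errors gives a rough sense of whether a gap of this size is more than noise for a given number of games.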
I'll call the version of Player 2 with the new input Player 2.1.