Turns out that the first innovation I mentioned in the last post - training on the conditional probability of gammon loss in the case that the game ends in a gammon - is irrelevant.
That's because the training only ever ends a game on a win for the player, since the winner is always the last side to play in a game.
As part of playing with this, though, I noticed that the probability of a gammon win from the starting board changes quite a bit during the training: anywhere from 10% to 40%. And the network does not recognize that the probability of a gammon is zero if the other player has borne in any checkers. That's one innovation that should help things: force the bot to notice that game fact.
Also, the probability of win from the starting board is exactly 50%. That is by construction, since I require that flipping the board perspective must give the same value. Of course, the probability of win given that a player has the first move is a bit more than 50%, and that should come out of this.
The right thing here, I think, is to add a bias node to the probability of win. That changes the symmetry relationship so that the probability of win is not always 50%.
That's because the training only ever ends a game on a win for the player, since the winner is always the last side to play in a game.
As part of playing with this, though, I noticed that the probability of a gammon win from the starting board changes quite a bit during the training: anywhere from 10% to 40%. And the network does not recognize that the probability of a gammon is zero if the other player has borne in any checkers. That's one innovation that should help things: force the bot to notice that game fact.
Also, the probability of win from the starting board is exactly 50%. That is by construction, since I require that flipping the board perspective must give the same value. Of course, the probability of win given that a player has the first move is a bit more than 50%, and that should come out of this.
The right thing here, I think, is to add a bias node to the probability of win. That changes the symmetry relationship so that the probability of win is not always 50%.
No comments:
Post a Comment