I've been spending a little time looking at cases where my Player 3.4 does poorly in the GNUbg contact benchmarks database, to get some feel for what new inputs I might try.
It looks like it's leaving blots too often when the opponent has a good prime blocking the way out of the home board.
So I tried two new inputs: the odds of entering the opponent's home board if there were a checker on the bar; and the odds of hitting an opponent blot in his home board if there were a checker on the bar.
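For concreteness, here is a minimal sketch of how two inputs like these could be computed by enumerating all 36 rolls. It assumes the simplest reading: a checker enters if either die lands on a point holding fewer than two opposing checkers, and a "hit" counts only when a die lands directly on a blot (entering with one die and then hitting with the other is ignored). The function name and the `home_board` layout are made up for illustration and are not taken from Player 3.4's code.

```python
from itertools import product

def bar_entry_odds(home_board):
    """Return (P(enter), P(hit a blot on entry)) for one checker on the bar.

    home_board[i] is the number of opposing checkers on the point that a
    roll of i + 1 would enter on (hypothetical layout, for illustration).
    """
    if len(home_board) != 6:
        raise ValueError("home_board must list exactly six points")

    # A die enters if its point holds 0 or 1 opposing checkers.
    open_dice = {d for d in range(1, 7) if home_board[d - 1] < 2}
    # A die hits if its point holds exactly one opposing checker.
    blot_dice = {d for d in range(1, 7) if home_board[d - 1] == 1}

    enter = hit = 0
    for d1, d2 in product(range(1, 7), repeat=2):
        if d1 in open_dice or d2 in open_dice:
            enter += 1
        if d1 in blot_dice or d2 in blot_dice:
            hit += 1
    return enter / 36, hit / 36

# Five-point prime with a blot on the one open point.
p_enter, p_hit = bar_entry_odds([2, 2, 2, 2, 2, 1])
```

In that example only rolls containing a 6 enter, so both probabilities come out to 11/36.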
I tried two training approaches. First, I kept the existing network, added random weights for just the four new inputs (the two inputs times two players), and did supervised learning on the GNUbg training databases. Second, I started from scratch with random weights everywhere, did TD training through self-play, and then did SL on the GNUbg training databases.
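The first approach amounts to widening the input layer while leaving the trained weights alone. A minimal sketch, assuming the input-to-hidden weights are stored as one list per hidden node (the storage layout, the function name, and the `scale` of the random initialization are all assumptions for illustration, not details from the actual player):

```python
import random

def add_inputs(hidden_weights, n_new, scale=0.1, rng=None):
    """Return a copy of the weights with n_new random input weights per node.

    hidden_weights[h][i] is the weight from input i to hidden node h
    (hypothetical layout). Existing weights are kept unchanged; only the
    weights for the new inputs are drawn uniformly from [-scale, scale].
    """
    rng = rng or random.Random()
    return [
        list(row) + [rng.uniform(-scale, scale) for _ in range(n_new)]
        for row in hidden_weights
    ]

# Two hidden nodes, two existing inputs, four new inputs.
trained = [[0.5, -0.2], [0.1, 0.3]]
widened = add_inputs(trained, n_new=4, rng=random.Random(0))
```

Starting from the trained weights means supervised learning only has to find useful values for the new columns, whereas the from-scratch approach lets every weight adapt to the new inputs.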
The conclusion: neither worked. In both cases the new player was about the same as or a little worse than Player 3.4. So these aren't the right new inputs to add.
Back to the drawing board.