Uh oh - I discovered a bug deep in the guts of the backgammon engine, in the piece that determines the available moves given a roll. I found it when running my trained network against itself for a 1,000-game benchmark - player 1 would consistently lose 0.3 points per game (ppg) against player 2, even though both players used the same strategy. It didn't matter whether the strategy was a neural net or pubEval - either way it was broken.
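The self-play check that exposed the bug is a handy symmetry test in general: play a strategy against an identical copy of itself for many games, and player 1's average points per game should hover near zero. A minimal sketch of that harness, using a hypothetical toy game in place of a real backgammon engine (the `toy_game`, `random_strategy`, and `self_play_ppg` names are illustrative, not from the original engine):

```python
import random

def self_play_ppg(play_game, strategy, n_games=1000, seed=0):
    """Benchmark a strategy against itself. For a correct, symmetric
    engine, player 1's points per game should be close to zero; a
    persistent bias (like the 0.3 ppg seen here) flags an engine bug."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_games):
        total += play_game(strategy, strategy, rng)
    return total / n_games

# Toy stand-in for a real game: each player's "score" is an independent
# die roll, so the game is symmetric by construction and ppg should be
# near 0. A real backgammon engine would slot in here instead.
def toy_game(strat1, strat2, rng):
    s1, s2 = strat1(rng), strat2(rng)
    return 1 if s1 > s2 else (-1 if s1 < s2 else 0)

def random_strategy(rng):
    return rng.randint(1, 6)

ppg = self_play_ppg(toy_game, random_strategy, n_games=10_000)
```

With a symmetric toy game the measured ppg lands within a few hundredths of zero; if the same test against a real engine shows a steady nonzero bias for one seat, the move generator (or some other seat-dependent piece) is the prime suspect.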
Fixed now, but it does invalidate all my earlier results.
I suspect the really strong performance (stronger than the published TD-Gammon results) I recorded against pubEval was an accidental consequence of this. I'm re-running the training now for the two-output network (10, 20, 40, and 80 hidden-node versions) and we'll see how those come out.
Note: I discovered much later that I had a few significant bugs. The most relevant here was in my implementation of pubEval - after fixing that it was a much stronger player.