I took that set of 25,000-ish checker images and trained a multi-layer perceptron (scikit-learn's MLPClassifier) on them to see how it'd do.
Each image was downscaled to 16x16 grayscale, and those 256 pixel values were the inputs to the MLP. They were normalized by dividing by 255 and then subtracting 0.5, so each of the 256 inputs lay in the range [-0.5, 0.5].
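As a quick illustration, here's a minimal sketch of that normalization step (the function name and the assumption that inputs arrive as 8-bit grayscale arrays are mine, not from the original pipeline):

```python
import numpy as np

def preprocess(image_16x16_gray):
    """Flatten a 16x16 grayscale image (uint8, 0-255) into the
    256-element input vector, normalized to [-0.5, 0.5]."""
    x = np.asarray(image_16x16_gray, dtype=np.float64)
    return (x / 255.0 - 0.5).ravel()

# An all-white image maps to 0.5 everywhere, all-black to -0.5.
white = np.full((16, 16), 255, dtype=np.uint8)
vec = preprocess(white)
print(vec.shape)   # (256,)
print(vec[0])      # 0.5
```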
As discussed before, I used a 99% threshold for classifying an image as a white or black checker, to improve precision at the expense of recall.
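One way to implement that kind of thresholding with scikit-learn is to take the class probabilities from `predict_proba` and only accept a prediction when the top probability clears the cutoff. This is a sketch under my own assumptions (the helper name and the "return `None` when unsure" convention are illustrative, not necessarily what the original code did):

```python
import numpy as np

def classify_with_threshold(clf, X, threshold=0.99):
    """Return the predicted class per sample, or None when the top
    class probability falls below the threshold. Rejecting low-
    confidence samples improves precision at the expense of recall."""
    probs = clf.predict_proba(X)             # shape (n_samples, n_classes)
    best = np.argmax(probs, axis=1)          # index of most probable class
    labels = clf.classes_[best]
    confident = probs[np.arange(len(probs)), best] >= threshold
    return [lab if ok else None for lab, ok in zip(labels, confident)]
```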
I split the input data into training and test sets: the training set was a randomly sampled 75% of the inputs, and the test set was the remaining 25%.
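That split is a one-liner with scikit-learn's `train_test_split`; here's a sketch with synthetic stand-in data (the variable names and random seed are mine):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((100, 256)) - 0.5    # stand-in for the normalized image vectors
y = rng.integers(0, 2, size=100)    # stand-in labels

# 75% training / 25% test, sampled at random.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(X_train.shape, X_test.shape)  # (75, 256) (25, 256)
```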
I judged the classifier's performance by looking at a confusion table, plus using it to identify checkers on a reference board.
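A confusion table like that can be produced with `sklearn.metrics.confusion_matrix`. The labels below are purely illustrative; the real evaluation would run on the held-out 25%:

```python
from sklearn.metrics import confusion_matrix

# Toy example: rows are true classes, columns are predicted classes.
y_true = ["white", "black", "white", "black", "white"]
y_pred = ["white", "black", "black", "black", "white"]
cm = confusion_matrix(y_true, y_pred, labels=["black", "white"])
print(cm)
# [[2 0]
#  [1 2]]
```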
This post gives some results for different topologies of the neural network. In all cases I used the "adam" solver (a stochastic gradient-based optimizer) and an L2 penalty parameter "alpha" equal to 1e-4.
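Sweeping topologies with those fixed settings might look like the following; the specific `hidden_layer_sizes` values here are placeholders for illustration, not the configurations actually tested in this post:

```python
from sklearn.neural_network import MLPClassifier

# One classifier per topology; only hidden_layer_sizes varies.
# The sizes below are illustrative, not the post's actual sweep.
for hidden in [(32,), (64,), (64, 32)]:
    clf = MLPClassifier(hidden_layer_sizes=hidden,
                        solver="adam",   # stochastic gradient-based optimizer
                        alpha=1e-4,      # L2 penalty
                        random_state=0)
    # clf.fit(X_train, y_train) would then train on the 256-input vectors
    print(clf.hidden_layer_sizes)
```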