I started with the second step in my pipeline: how to identify the checkers on the board.
The approach I'm trying:
- Start with the raw image of the board, laid out so that it's wider than it is tall, so that the points are vertical.
- Define a small square window, something comparable to the checker size.
- Scan that window across the board. At each step, look at the part of the raw image that's in the window and decide whether it contains a white checker, a black checker, or no checker.
- Process the window image: downsample into a 16x16 grid of grayscale pixels.
- My metric for deciding whether a checker is in the image: one checker has > 75% of its area in the window, no other checkers have > 25% of their areas in the window, and the center of the window lands inside the one main checker.
- If a checker is counted as in the window, remember a point equal to the center of the window, tagged as white or black depending on which checker was found.
- Run that scan across the whole board, for a range of window sizes, and for a range of starting pixel positions for the scan, so it gets a lot of views of the board in slightly different ways.
At the end of this I end up with a bunch of dots, hopefully grouped around the actual checker positions. Then I need some way of identifying the groupings as checkers, but that's the next step.
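The scan described above can be sketched roughly as follows. This is a simplified illustration, not my actual code: the `classify` function is a hypothetical stand-in for the trained classifier, and the step/offset scheme is one plausible way to get overlapping views of the board.

```python
import numpy as np

def downsample(window, size=16):
    """Average-pool a grayscale window into a size x size grid."""
    h, w = window.shape
    ys = np.linspace(0, h, size + 1).astype(int)
    xs = np.linspace(0, w, size + 1).astype(int)
    out = np.empty((size, size))
    for i in range(size):
        for j in range(size):
            out[i, j] = window[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
    return out

def scan_board(board, classify, window_sizes, step_fracs=(0.0, 0.25, 0.5, 0.75)):
    """Slide square windows over the board and collect tagged center points.

    `classify` is assumed to map a 16x16 grayscale patch to one of
    'none', 'white', or 'black' (a hypothetical interface)."""
    hits = []
    h, w = board.shape
    for size in window_sizes:
        step = size  # non-overlapping per pass; offsets shift the grid
        for frac in step_fracs:
            off = int(frac * size)
            for y in range(off, h - size + 1, step):
                for x in range(off, w - size + 1, step):
                    label = classify(downsample(board[y:y + size, x:x + size]))
                    if label != 'none':
                        hits.append((x + size / 2, y + size / 2, label))
    return hits
```

Running this for several window sizes and grid offsets produces the cloud of tagged center points that later gets clustered into checkers.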
I decided to train a multi-layer perceptron - a neural network - to identify the three categories for each window image: no checker, a black checker, or a white checker. For this I used scikit-learn's MLPClassifier, with the following assumptions:
- Adam solver (a stochastic gradient-based optimizer)
- ReLU activation function for every node
- Two hidden layers with 100 nodes in each
- L2 penalty parameter "alpha" equal to 1e-4
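Those settings map directly onto the `MLPClassifier` constructor. Something like this (the feature size of 256 comes from the 16x16 grayscale grid):

```python
from sklearn.neural_network import MLPClassifier

# Input: 16x16 grayscale windows flattened to 256 features.
# Output: three classes - no checker, white checker, black checker.
clf = MLPClassifier(
    solver='adam',                  # stochastic gradient-based optimizer
    activation='relu',              # ReLU at every hidden node
    hidden_layer_sizes=(100, 100),  # two hidden layers, 100 nodes each
    alpha=1e-4,                     # L2 penalty
)
```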
In this use case, I care about the precision of the classifier - that is, the ability of the classifier not to label as positive a sample that is negative - more than the recall - the ability of the classifier to find all the true samples. That's because it does this grid scan, and should identify the same checker many times in its scans. If it misses a checker on a few scans (because it's focused on precision over recall), it should catch it on other scans.
To boost the precision, I count a window as having a white or black checker only if the classifier assigns that class > 99% probability. That means a bunch of real checkers will be missed and counted as false negatives, which reduces the recall, but there should be very few false positives, which keeps the precision high.
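The thresholding step might look like this sketch, which uses scikit-learn's `predict_proba` to get per-class probabilities. The 'none'/'white'/'black' label scheme is my illustration, not necessarily the exact labels used:

```python
import numpy as np

def classify_window(clf, patch, threshold=0.99):
    """Return 'white' or 'black' only when the classifier is very confident;
    otherwise treat the window as empty. Assumes clf was trained with
    string labels 'none', 'white', and 'black'."""
    probs = clf.predict_proba(patch.reshape(1, -1))[0]
    best = np.argmax(probs)
    label = clf.classes_[best]
    if label != 'none' and probs[best] > threshold:
        return label
    return 'none'
```

Anything below the 99% bar falls through to 'none', trading recall for precision as described above.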
Before I could start training, though, I needed some labeled input data to train and test against. That is, images that are about the size of the window, each labeled as "has no clear checker", "has a white checker", or "has a black checker". Where to get these labeled inputs?
I took a picture of a few backgammon boards, and started by manually cropping out images and saving them, then manually labeling them. That was pretty slow - it took me a half hour just to get 100 or so.
Then I realized I could use my scanning approach. I took a picture of a board, then manually identified where the checkers were on the board: center point and radius for each, as well as whether it was white or black. I saved that identification data to a file, then ran a scan over the board and used the manually-identified checker positions to decide whether each window contained > 75% of one checker and < 25% of any other checker, with the window's center landing inside the one main checker.
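The geometric test can be approximated by sampling a grid of points over each checker, since the exact circle-square intersection area is fiddly to compute analytically. This is my sketch of that idea, with hypothetical function names, assuming checkers are given as (center x, center y, radius, color):

```python
import numpy as np

def checker_fraction_in_window(cx, cy, r, wx0, wy0, wsize, n=64):
    """Approximate the fraction of a checker's area inside a square window
    by sampling an n x n grid over the checker's bounding box."""
    xs = np.linspace(cx - r, cx + r, n)
    ys = np.linspace(cy - r, cy + r, n)
    X, Y = np.meshgrid(xs, ys)
    inside_checker = (X - cx) ** 2 + (Y - cy) ** 2 <= r ** 2
    inside_window = ((X >= wx0) & (X <= wx0 + wsize) &
                     (Y >= wy0) & (Y <= wy0 + wsize))
    return (inside_checker & inside_window).sum() / inside_checker.sum()

def label_window(checkers, wx0, wy0, wsize):
    """checkers: list of (cx, cy, r, color). Returns the color when exactly
    one checker has > 75% of its area in the window, no other checker has
    > 25%, and the window center falls inside that checker; else 'none'."""
    wcx, wcy = wx0 + wsize / 2, wy0 + wsize / 2
    fracs = [(checker_fraction_in_window(cx, cy, r, wx0, wy0, wsize),
              cx, cy, r, color) for cx, cy, r, color in checkers]
    main = [f for f in fracs if f[0] > 0.75]
    others = [f for f in fracs if 0.25 < f[0] <= 0.75]
    if len(main) == 1 and not others:
        frac, cx, cy, r, color = main[0]
        if (wcx - cx) ** 2 + (wcy - cy) ** 2 <= r ** 2:
            return color
    return 'none'
```

Running `label_window` at every scan position turns the hand-marked checker positions into per-window labels automatically.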
I did that with a handful of boards, and that generated about 2,400 black checker images, 2,600 white checker images, and 200,000 images with no checker. I discarded 90% of the "no checker" images just so the input data set wasn't too skewed towards that side, so ended up with about 80% of the examples being "no checker", and about 10% each in white and black checkers.
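The rebalancing step is simple random downsampling of the "no checker" class. A minimal sketch, assuming examples are (image, label) pairs with the hypothetical label 'none' for empty windows:

```python
import random

def balance_negatives(examples, keep_frac=0.1, seed=0):
    """Keep all checker examples but only a random keep_frac of the
    'none' examples, so empty windows don't swamp the training set."""
    rng = random.Random(seed)
    return [(img, label) for img, label in examples
            if label != 'none' or rng.random() < keep_frac]
```

With ~200,000 negatives cut to ~20,000 against ~5,000 checker examples, that gives roughly the 80/10/10 split described above.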
That served as my training set for the checker classifier.