Friday, February 17, 2012

Bug in benchmark calculation: fixed

I just discovered a bug in my benchmark calculation: if the player's board was not found in the benchmark's list of the five best rolled-out boards, the code assumed an equity error of zero instead of the worst equity error in that list. That made those edge cases look much better than they should have, and made the average ERs come out a bit lower (better) than they really are.
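A minimal sketch of the corrected lookup, assuming each benchmark entry is a mapping from board to rolled-out equity for the best five boards. The names `equity_error` and `rolled_out` are hypothetical and are not taken from the actual benchmark code.

```python
def equity_error(player_board, rolled_out):
    """Equity error of the player's chosen board for one benchmark position.

    rolled_out: hypothetical dict mapping each of the best-five
    rolled-out boards to its rolled-out equity.
    """
    best_equity = max(rolled_out.values())
    # Error of each listed board relative to the best board.
    errors = {board: best_equity - eq for board, eq in rolled_out.items()}
    if player_board in errors:
        return errors[player_board]
    # The bug: this case used to return 0.0, as if the move were perfect.
    # The fix: a board outside the list is charged the worst listed error.
    return max(errors.values())


example = {"a": 0.5, "b": 0.25, "c": 0.125}
listed = equity_error("b", example)      # board is in the list
unlisted = equity_error("z", example)    # board is not in the list
```

Charging the worst listed error is still a lower bound on the true error, since a board that did not make the top five is at least as bad as the fifth one.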

Below is a table of the corrected benchmark results for the nine players in the original results, plus four new ones:

| Player | GNUbg Contact ER | GNUbg Crashed ER | GNUbg Race ER | PubEval Avg Ppg | Benchmark 2 Avg Ppg |
|---|---|---|---|---|---|
| | 10.5 | 11.0 | 1.01 | - | |
| | 14.0 | 12.6 | 2.08 | 0.547 | 0.131 |
| | 33.7 | 26.8 | 2.40 | 0.146 | -0.282 |
| | 14.9 | 14.2 | 2.08 | 0.548 | 0.106 |
| | 14.9 | 14.2 | 2.67 | 0.550 | 0.108 |
| | 18.2 | 19.3 | 1.98 | 0.480 | 0.072 |
| | 38.3 | 41.1 | 4.70 | 0.119 | -0.283 |
| | 18.7 | 19.8 | 2.01 | 0.460 | 0.069 |
| | 20.5 | 30.0 | 2.09 | 0.442 | 0.021 |
| | 21.5 | 23.7 | 5.44 | 0.432 | 0 |
| Benchmark 2 (10) | 42.7 | 37.5 | 13.20 | 0.064 | -0.418 |
| Benchmark 2 (40) | 26.2 | 25.9 | 5.94 | 0.330 | -0.067 |
| | 23.0 | 24.5 | 9.66 | 0.351 | -0.101 |
| PubEval | 44.1 | 49.7 | 3.54 | 0 | -0.437 |


Redoing the one-variable regressions on the corrected & expanded data:


| Metric | PubEval Ppg vs Contact ER | PubEval Ppg vs Crashed ER | PubEval Ppg vs Race ER | BM2 Ppg vs Contact ER | BM2 Ppg vs Crashed ER | BM2 Ppg vs Race ER |
|---|---|---|---|---|---|---|
| Slope | -0.0182 | -0.0161 | -0.0268 | -0.0189 | -0.0165 | -0.0275 |
| Intercept | 0.8065 | 0.7638 | 0.4631 | 0.3969 | 0.3468 | 0.0362 |
| R-Squared | 98.8% | 83.5% | 22.5% | 98.0% | 80.6% | 22.3% |
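For anyone who wants to check the first column, here is a small sketch of an ordinary least-squares fit of PubEval Ppg against Contact ER. It assumes the regression used the 13 rows of the results table that have a PubEval score (the first row has none); the helper name `fit_line` is hypothetical.

```python
from math import fsum

# (Contact ER, PubEval Avg Ppg) pairs transcribed from the results table,
# skipping the first row, which has no PubEval score.
DATA = [
    (14.0, 0.547), (33.7, 0.146), (14.9, 0.548), (14.9, 0.550),
    (18.2, 0.480), (38.3, 0.119), (18.7, 0.460), (20.5, 0.442),
    (21.5, 0.432), (42.7, 0.064), (26.2, 0.330), (23.0, 0.351),
    (44.1, 0.0),
]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = fsum(x for x, _ in points) / n
    mean_y = fsum(y for _, y in points) / n
    sxx = fsum((x - mean_x) ** 2 for x, _ in points)
    syy = fsum((y - mean_y) ** 2 for _, y in points)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R^2 is the squared correlation of x and y.
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(DATA)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.1%}")
```

Under that row-selection assumption the fit should come out close to the reported slope, intercept and R-squared for PubEval Ppg vs Contact ER. Swapping in the other ER columns or the Benchmark 2 scores gives the remaining columns.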


Redoing the multivariate linear regression:

| Benchmark | Intercept | Contact ER Slope | Crashed ER Slope | Race ER Slope | R-Squared |
|---|---|---|---|---|---|
| PubEval | 0.8037 | -0.01980 | +0.00133 | +0.00202 | 98.9% |
| Benchmark 2 | 0.3945 | -0.02049 | +0.00209 | -0.00246 | 98.4% |
| Benchmark 2 (only good players) | 0.4102 | -0.01756 | -0.00063 | -0.00694 | 98.5% |
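To show how the multivariate coefficients are read, the sketch below plugs one player's three ERs into the fitted linear model. The coefficients are copied from the multivariate table; the names `COEFFS` and `predicted_ppg` are hypothetical.

```python
# (intercept, Contact ER slope, Crashed ER slope, Race ER slope),
# copied from the multivariate regression table.
COEFFS = {
    "PubEval": (0.8037, -0.01980, 0.00133, 0.00202),
    "Benchmark 2": (0.3945, -0.02049, 0.00209, -0.00246),
}


def predicted_ppg(benchmark, contact_er, crashed_er, race_er):
    """Predicted average points per game against the given benchmark."""
    intercept, b_contact, b_crashed, b_race = COEFFS[benchmark]
    return (intercept
            + b_contact * contact_er
            + b_crashed * crashed_er
            + b_race * race_er)


# The results-table row with ERs 14.0 / 12.6 / 2.08.
pred_pubeval = predicted_ppg("PubEval", 14.0, 12.6, 2.08)
pred_bm2 = predicted_ppg("Benchmark 2", 14.0, 12.6, 2.08)
print(f"{pred_pubeval:.3f} {pred_bm2:.3f}")
```

For that row the model predicts about 0.547 ppg against PubEval and about 0.129 against Benchmark 2, versus measured values of 0.547 and 0.131. In the PubEval case the Contact ER term contributes roughly -0.277, while the Crashed and Race terms contribute only about +0.017 and +0.004.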

The main conclusion: Contact ER mostly determines cubeless money play score. The single regression of score against Crashed ER has a relatively high R^2, but that is only because Crashed ER is highly correlated with Contact ER. When properly separated with the multivariate regression it becomes clear that Contact ER is the only measure that really matters.
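The correlation between Crashed ER and Contact ER can be checked directly from the results table. This sketch computes the Pearson correlation over the same 13 rows that have a PubEval score; that row selection is an assumption, and `pearson` is a hypothetical helper.

```python
from math import fsum, sqrt

# (Contact ER, Crashed ER) pairs transcribed from the results table,
# for the 13 rows that also have a PubEval score.
PAIRS = [
    (14.0, 12.6), (33.7, 26.8), (14.9, 14.2), (14.9, 14.2),
    (18.2, 19.3), (38.3, 41.1), (18.7, 19.8), (20.5, 30.0),
    (21.5, 23.7), (42.7, 37.5), (26.2, 25.9), (23.0, 24.5),
    (44.1, 49.7),
]


def pearson(pairs):
    """Pearson correlation coefficient of the paired values."""
    n = len(pairs)
    mean_a = fsum(a for a, _ in pairs) / n
    mean_b = fsum(b for _, b in pairs) / n
    saa = fsum((a - mean_a) ** 2 for a, _ in pairs)
    sbb = fsum((b - mean_b) ** 2 for _, b in pairs)
    sab = fsum((a - mean_a) * (b - mean_b) for a, b in pairs)
    return sab / sqrt(saa * sbb)


r = pearson(PAIRS)
print(f"corr(Contact ER, Crashed ER) = {r:.3f}")
```

On these rows the correlation comes out above 0.9, which is why a one-variable fit against Crashed ER still looks respectable even though its slopes in the multivariate fit are close to zero.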
