Jan 11, 2012 by Cliff DeJong
NOTE: This article was written at the start of the 2012 NASCAR season to provide a better understanding of the algorithm used in our AccuPredict - NASCAR Driver Finish Predictions tool. The basis of the formula remains as described in this article, although some tweaks to the weighting of the formulas are made by Cliff after each NASCAR season.
One way to decrease the time interval for measurement of a driver's performance is to look at performance at similar tracks.
For example, Jeff Gordon always does well at flat tracks, such as Martinsville and Loudon. There are several races each year at flat tracks, so averaging over the past eight races at flat tracks would only cover races during the past year and therefore would reflect more recent performance, rather than requiring several years' performance at an individual track.
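As a rough illustration of the averaging described above, here is a minimal Python sketch. The function name, the data layout, and the sample finishes are hypothetical and not taken from AccuPredict; the sketch also assumes that when a driver has fewer than N prior races in a group, whatever races exist are averaged.

```python
# Illustrative track-to-group mapping; only a handful of tracks are listed.
TRACK_GROUP = {
    "Martinsville": "Flat",
    "New Hampshire": "Flat",   # Loudon
    "Bristol": "Steep",
    "Darlington": "Steep",
}


def trailing_group_average(history, group, n=8):
    """Average finish over a driver's most recent n races at tracks in `group`.

    `history` is a chronological list of (track, finish) tuples for one driver.
    Returns None when the driver has no races in that group.
    Assumption: if fewer than n races exist, all of them are averaged.
    """
    finishes = [finish for track, finish in history
                if TRACK_GROUP.get(track) == group]
    recent = finishes[-n:]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Made-up finishes for one driver, oldest race first.
history = [
    ("Martinsville", 3),
    ("Bristol", 20),
    ("New Hampshire", 5),
    ("Martinsville", 1),
]
flat_average = trailing_group_average(history, "Flat", n=8)
```

Because the window is counted in races rather than in seasons, a group with several races per year fills an eight-race window far sooner than a single track would.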
Tracks can be grouped in several ways. The most accurate grouping that I have seen so far is shown in the Figure 7 table below.
|FIGURE 7 - TRADITIONAL TRACK GROUPINGS|
|Flat Tracks||Shallow Tracks||Steep Tracks||Cookie Cutter Tracks||Road Course||Restrictor Plate||ODD 1||ODD 2|
|New Hampshire||Pocono||Darlington||Charlotte||Watkins Glen||Talladega||Kansas||Michigan|
This grouping, made by Christopher Harris of ESPN in 2007, is based on physical similarities between tracks: track length and corner banking. The ODD tracks are those that do not fit neatly into the other categories but share some similarities with the others in the same ODD listing.
The theory behind similar-track groupings is that a driver's performance will be consistent across all tracks of the same type.
To evaluate this, I looked at the 1991-2011 database and correlated each driver's finish to his average performance at previous races at tracks in the same grouping.
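The evaluation described above can be sketched as follows. This is only an illustration under stated assumptions: the article does not say which correlation measure was used, so Pearson correlation is assumed here, and the function names, the data layout, and the choice to average whatever prior races exist (up to N) are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def finish_vs_prior_average(races, track_group, n=8):
    """Pair each finish with the driver's prior average in the same group.

    `races` is a chronological list of (driver, track, finish) tuples.
    Returns (prior_averages, finishes); a driver's first race in a group
    is skipped because there is nothing yet to average.
    """
    past = {}  # (driver, group) -> finishes so far, oldest first
    prior_averages, finishes = [], []
    for driver, track, finish in races:
        seen = past.setdefault((driver, track_group[track]), [])
        if seen:
            recent = seen[-n:]
            prior_averages.append(sum(recent) / len(recent))
            finishes.append(finish)
        seen.append(finish)  # only becomes "prior" for later races
    return prior_averages, finishes


# Made-up data: two drivers, three races at flat tracks.
track_group = {"Martinsville": "Flat", "New Hampshire": "Flat"}
races = [
    ("driver_a", "Martinsville", 2), ("driver_b", "Martinsville", 10),
    ("driver_a", "New Hampshire", 4), ("driver_b", "New Hampshire", 12),
    ("driver_a", "Martinsville", 3), ("driver_b", "Martinsville", 20),
]
xs, ys = finish_vs_prior_average(races, track_group, n=8)
r = pearson(xs, ys)
```

Each finish is appended to the history only after it has been paired with the prior average, so a race never contributes to its own predictor.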
|Flat||Shallow||Steep||Cookie Cutter||Road Course||Restrictor Plate||ODD 1||ODD 2|
|FIGURE 8 - TRADITIONAL TRACK GROUPINGS (one chart per grouping)|
For example, I looked at Bristol finishes for each driver against the average finish over the last N races at any steep track. This may include previous races at Bristol. When these averages for Bristol based on steep tracks are plotted against averages over all steep tracks, you would expect similarly shaped curves if Bristol is properly classified as a steep track and the supposition of similar performance holds.
The table in Figure 8 shows the results for all eight groupings of tracks.
Note that some tracks, such as Chicago, Watkins Glen, and the plate tracks, are poorly correlated with other tracks in their categories. In general, most of the curves show their best correlation for an average taken over the past eight races at tracks in the same category.
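The curves discussed above plot correlation against the number of races averaged. A minimal sketch of how one such curve could be computed for a single track is given below. The function names and sample data are hypothetical, Pearson correlation is an assumption, and this version only uses a finish when the driver already has a full N prior races in the group, which may differ from how the original analysis treated short histories.

```python
import math


def pearson_or_none(xs, ys):
    """Pearson correlation, or None when it cannot be computed."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlation_by_window(races, track_group, target, windows=(1, 2, 4, 8)):
    """Correlate finishes at `target` with the average of the driver's
    previous N races at any track in the same group, for each N."""
    group = track_group[target]
    curve = {}
    for n in windows:
        past = {}  # driver -> finishes at tracks in this group, oldest first
        xs, ys = [], []
        for driver, track, finish in races:
            if track_group.get(track) != group:
                continue
            seen = past.setdefault(driver, [])
            if track == target and len(seen) >= n:
                xs.append(sum(seen[-n:]) / n)
                ys.append(finish)
            seen.append(finish)
        curve[n] = pearson_or_none(xs, ys)
    return curve


# Made-up finishes for three drivers, oldest race first.
track_group = {"Bristol": "Steep", "Darlington": "Steep",
               "Watkins Glen": "Road Course"}
races = [
    ("a", "Darlington", 2), ("b", "Darlington", 30), ("c", "Darlington", 15),
    ("a", "Watkins Glen", 40),
    ("a", "Bristol", 3), ("b", "Bristol", 28), ("c", "Bristol", 14),
    ("a", "Darlington", 4), ("b", "Darlington", 25), ("c", "Darlington", 16),
    ("a", "Bristol", 1), ("b", "Bristol", 27), ("c", "Bristol", 18),
]
curve = correlation_by_window(races, track_group, "Bristol")
```

Running the same function for every track in a group and comparing the resulting curves is one way to check whether a track behaves like the rest of its category.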
Some tracks were regrouped because of these results.
Including the ODD tracks with the Cookie Cutter Tracks proved advantageous. This revised category was named Large Ovals, and its performance is shown in Figure 9.
Again, the average over the last eight races is a good measure of performance.
The figure shows a fairly tight clustering of the curves and improved correlations for each of the member tracks with average finishes over the other tracks in the Large Ovals category.
The exception is still Kentucky, whose result is based on only one race and is therefore not a concern.
Figure 10 compares the correlations for these tracks under their previous track-type groupings and under the revised Large Oval category.
Results are for eight-race averages.
All tracks in this new category perform better, and Chicago's correlation is much improved and now on a par with other tracks in this category.
Michigan and Atlanta are significantly improved as well.
Indianapolis and Pocono, grouped as Shallow Tracks, offered only three races per year, and therefore were grouped into the Flat Tracks category.
Results are given in Figure 11 for eight-race averages, and show improvement in all Flat Track correlations.
Pocono in particular is better correlated with this new grouping of Flat Tracks.
There are no obvious re-classifications for Road Course or Restrictor Plate Tracks.
When the four road course and plate tracks were grouped together as an experiment, the correlations improved for the Plate Tracks, but the Road Course results were worse than with the original groupings.
There does not appear to be any justification for regrouping these tracks.
|Flat Tracks||Steep Tracks||Large Oval Tracks||Road Course||Restrictor Plate|
|FIGURE 12 - Revised Similar Track Groups|
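The regrouping described above amounts to a simple relabeling of the traditional categories, which can be sketched in Python as follows. The dictionary and function names are hypothetical, and only the eight example tracks from the Figure 7 row are included.

```python
# Traditional grouping for the example tracks in the Figure 7 row.
TRADITIONAL = {
    "New Hampshire": "Flat",
    "Pocono": "Shallow",
    "Darlington": "Steep",
    "Charlotte": "Cookie Cutter",
    "Watkins Glen": "Road Course",
    "Talladega": "Restrictor Plate",
    "Kansas": "ODD 1",
    "Michigan": "ODD 2",
}

# Category merges described in the text; unlisted categories are unchanged.
REGROUP = {
    "Cookie Cutter": "Large Oval",
    "ODD 1": "Large Oval",
    "ODD 2": "Large Oval",
    "Shallow": "Flat",
}


def revise_groups(track_group):
    """Return a new track-to-group mapping with the merges applied."""
    return {track: REGROUP.get(group, group)
            for track, group in track_group.items()}


revised = revise_groups(TRADITIONAL)
```

Keeping the merges in a separate table makes it easy to try an alternative grouping and rerun the correlations without editing the per-track data.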
Figure 13 gives the correlation averages for the revised track types.
Cliff DeJong (pronounced De Young), the man behind AccuPredict, is a research scientist who has been crunching numbers his entire life. An avid NASCAR fan, Cliff was introduced to fantasy NASCAR by his brother (who beat him at just about everything).
Cliff put his Carnegie Mellon Computer Science degree and Iowa State University Mathematics degree to use creating successful methods to predict each Cup race based on NASCAR statistics.
It is an obsession that has consumed untold hours.
Cliff would love to hear your comments, questions and suggestions at accupredict@gmail.com