Forecast Accuracy

Accuracy is best measured as the average absolute percentage variance.

For each store, compute the percentage difference between estimated and actual sales, disregard the sign (positive or negative), and then average the results across all stores.
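
The calculation can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name and the sales figures are hypothetical, and it assumes the percentage is taken relative to actual sales (the text does not say which figure is the base) and that actual sales are greater than zero.

```python
def mean_absolute_percentage_variance(actual, estimate):
    """Average of |estimate - actual| / actual across stores, in percent.

    Assumption: actual sales are the base of the percentage and are > 0.
    """
    if len(actual) != len(estimate) or not actual:
        raise ValueError("need two non-empty, equal-length sequences")
    # Taking the absolute value discards the sign of each store's variance.
    pct_errors = [abs(e - a) / a * 100.0 for a, e in zip(actual, estimate)]
    return sum(pct_errors) / len(pct_errors)


# Hypothetical annual sales (in $000s) for four stores.
actual_sales = [1000.0, 800.0, 1250.0, 500.0]
estimated_sales = [900.0, 880.0, 1250.0, 550.0]

mapv = mean_absolute_percentage_variance(actual_sales, estimated_sales)
print(f"Average absolute percentage variance: {mapv:.1f}%")
```

For these four hypothetical stores the individual variances are 10%, 10%, 0% and 10%, so the average is 7.5%. Because the sign is dropped, an over-estimate at one store cannot cancel an under-estimate at another.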

The best measure of sales forecast accuracy is, of course, to run the system on a large number of sites, wait for them to mature, and then compare actual sales with estimated sales. However, this is not particularly helpful in the short run, when one is trying to gauge the likely efficacy of the forecasting system. Even this procedure can prove elusive when unanticipated factors change the parameters assumed at the time of the sales forecast:

  • changes in surrounding competition
  • macroeconomic upheavals: booms and busts
  • altered access features: new roads, construction interruptions, etc.
  • changes in co-tenancy

There are two fairly good processes for evaluating the accuracy of a sales forecasting system within a short time after the system has been developed.

Cross-Validation

Cross-validation provides a good check on the validity of the model. The basic model is recalibrated with each database store omitted, one at a time, and the resulting changes in the significance and weights of the variables are examined. In a second round of tests, each variable is omitted one at a time to check the stability of the model. A favorable outcome of the cross-validation test is a strong indication of the model’s accuracy and stability.
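
The first round of tests, omitting one store at a time, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a hypothetical one-variable linear model (sales against trade-area population) with invented data, and it tracks only the variable's weight, not its significance. A real forecasting model would have several variables, and the second round (omitting each variable in turn) is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_slopes(xs, ys):
    """Recalibrate the model once per store, omitting that store each time."""
    slopes = []
    for i in range(len(xs)):
        _, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        slopes.append(slope)
    return slopes


# Hypothetical data: trade-area population (000s) and annual sales ($000s).
population = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
sales = [310.0, 495.0, 710.0, 905.0, 1090.0, 1320.0]

_, full_slope = fit_line(population, sales)
loo_slopes = leave_one_out_slopes(population, sales)
for store, slope in enumerate(loo_slopes):
    shift = (slope - full_slope) / full_slope * 100.0
    print(f"omit store {store}: weight {slope:.2f} ({shift:+.1f}% vs. full model)")
```

If the weight barely moves whichever store is left out, the model is not leaning on any single store. A large shift when one store is omitted suggests that store is unusually influential and deserves a closer look.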

Of course, we look at cross-validation samples during the discovery process, and if the results are unsatisfactory we continue to introduce new variables and measures to refine the model. Sometimes the system is initially strong except in certain circumstances, so segmenting the database into distinct characteristic “buckets” (for instance, by Location Type) can be helpful. Below are some examples from a recent project.

Holdout Sample

This involves holding out a set of existing stores from the model database so that the model’s estimates for those stores can be compared with their actual sales. Care should be taken to avoid atypical stores or over-weighting certain characteristics; that is, make sure that the holdout sample is about as “representative” of your store base as the model database.
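
One possible way to keep the holdout sample representative is to draw it separately from each bucket of stores. The sketch below is an assumption-laden illustration rather than a description of any particular system: the `location_type` field, the store counts and the 20% holdout fraction are all hypothetical, and stratifying on a single characteristic is only one of several reasonable approaches.

```python
import random
from collections import defaultdict


def stratified_holdout(stores, key, fraction, seed=0):
    """Hold out roughly `fraction` of the stores from each bucket of `key`."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    buckets = defaultdict(list)
    for store in stores:
        buckets[store[key]].append(store)
    model_db, holdout = [], []
    for members in buckets.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        n_hold = round(len(shuffled) * fraction)
        holdout.extend(shuffled[:n_hold])
        model_db.extend(shuffled[n_hold:])
    return model_db, holdout


def share_by(stores, key):
    """Proportion of stores falling in each bucket of `key`."""
    counts = defaultdict(int)
    for store in stores:
        counts[store[key]] += 1
    return {bucket: n / len(stores) for bucket, n in counts.items()}


# Hypothetical store base: 10 mall, 20 strip-center and 10 freestanding stores.
types = ["mall"] * 10 + ["strip center"] * 20 + ["freestanding"] * 10
stores = [{"id": i, "location_type": t} for i, t in enumerate(types)]

model_db, holdout = stratified_holdout(stores, "location_type", 0.2)
print("model database:", share_by(model_db, "location_type"))
print("holdout sample:", share_by(holdout, "location_type"))
```

Comparing the two sets of proportions is a quick check that neither the holdout sample nor the remaining model database over-weights a particular location type. The same comparison can be repeated for any other characteristic that matters to the forecast.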