Predictive analytics is an area of statistical analysis that extracts information from data and uses it to predict future trends and behavior patterns. At its core, predictive analytics captures the relationships between explanatory variables and predicted variables in past occurrences and exploits those relationships to predict future outcomes. The accuracy and usability of the results, however, depend greatly on the user’s level of business and data understanding. Site Perfect Research has the know-how to apply several different statistical techniques to fit the location and forecasting challenges you face.
Multivariate Regression Modeling
Regression models are the mainstay of predictive analytics. The focus lies on establishing a mathematical equation as a model that represents the interactions between the variables under consideration.
Linear regression models analyze the relationship between a dependent variable (e.g., sales volume) and a set of independent or predictor variables (e.g., demographics, lifestyle categories, competitive impacts, and site factors). This relationship is expressed as an equation that predicts the response (dependent) variable as a linear function of the parameters. The parameters are adjusted so that a measure of fit is optimized, most commonly by minimizing the sum of squared residuals. Much of the effort in model fitting is focused on minimizing the size of the residuals, the differences between observed and predicted values, and on ensuring that they are randomly distributed with respect to the model predictions.
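The fitting step described above can be sketched in a few lines of Python. This is a minimal illustration, not Site Perfect Research's actual model: it fits an ordinary least squares equation by solving the normal equations, and the two predictors (households in thousands, number of nearby competitors) and all figures are made up for the example. The function names `fit_linear_regression` and `predict` are likewise hypothetical.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares: solve the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # leading 1.0 = intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]  # [intercept, b1, b2, ...]


def predict(coefs, row):
    """Evaluate the fitted linear equation for one observation."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Illustrative, made-up data: [households (thousands), nearby competitors]
X = [[10, 2], [15, 1], [8, 3], [20, 0], [12, 2], [18, 1]]
y = [143, 193, 116, 246, 162, 225]  # made-up sales volumes

coefs = fit_linear_regression(X, y)
residuals = [yi - predict(coefs, r) for r, yi in zip(X, y)]
print("coefficients:", coefs)
print("residuals:", residuals)
```

With an intercept in the model, the least squares residuals sum to zero and are uncorrelated with each predictor, so what remains to check is whether they show any pattern when compared with the predicted values.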
CHAID Models
CHAID stands for CHi-squared Automatic Interaction Detector. It is a decision tree technique based on adjusted significance testing. CHAID can be used for prediction (in a similar fashion to regression analysis), for classification, and for detecting interactions between variables.
In practice, CHAID is often used in direct marketing to select groups of consumers and to predict how their responses to some variables affect other variables. Like other decision trees, CHAID produces output that is highly visual and easy to interpret. Because it uses multiway splits by default, it needs fairly large sample sizes to work effectively; with small samples, the respondent groups can quickly become too small for reliable analysis.
CHAID detects interaction between variables in the data set. Using this technique it is possible to establish relationships between a ‘dependent variable’, such as sales, and explanatory variables such as distance, store format, and demographic/lifestyle categories. CHAID does this by identifying discrete groups of respondents and using their responses to the explanatory variables to predict the impact on the dependent variable.
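The chi-squared test at the heart of CHAID can be sketched as follows. This is a simplified illustration of a single split-selection step only: it cross-tabulates each categorical predictor against the dependent variable and picks the predictor with the smallest p-value. A full CHAID implementation also merges similar categories, adjusts the p-values for multiple testing, and repeats the process within each resulting group, none of which is shown here. The field names (`distance`, `format`, `sales`), the helper names, and all counts are made up for the example.

```python
import math
from collections import Counter


def chi_square(records, predictor, target):
    """Pearson chi-squared statistic and degrees of freedom for a cross-tab."""
    cells = Counter((r[predictor], r[target]) for r in records)
    row_tot = Counter(r[predictor] for r in records)
    col_tot = Counter(r[target] for r in records)
    n = len(records)
    stat = 0.0
    for a in row_tot:
        for b in col_tot:
            expected = row_tot[a] * col_tot[b] / n
            stat += (cells[(a, b)] - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution (integer df >= 1)."""
    z = x / 2.0
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(z)) if df % 2 else math.exp(-z)
    while a < df / 2.0:  # recurrence on the regularized incomplete gamma
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q


def best_split(records, predictors, target):
    """Return (predictor, p_value) for the most significant predictor."""
    results = []
    for name in predictors:
        stat, df = chi_square(records, name, target)
        if df > 0:  # skip predictors with a single category
            results.append((chi2_sf(stat, df), name))
    p, name = min(results)
    return name, p


# Illustrative, made-up counts: (distance band, store format, sales band, n)
counts = [
    ("near", "mall", "high", 8), ("near", "strip", "high", 8),
    ("near", "mall", "low", 2), ("near", "strip", "low", 2),
    ("mid", "mall", "high", 5), ("mid", "strip", "high", 5),
    ("mid", "mall", "low", 5), ("mid", "strip", "low", 5),
    ("far", "mall", "high", 2), ("far", "strip", "high", 2),
    ("far", "mall", "low", 8), ("far", "strip", "low", 8),
]
records = [{"distance": d, "format": f, "sales": s}
           for d, f, s, n in counts for _ in range(n)]

name, p = best_split(records, ["distance", "format"], "sales")
print("first split on:", name, "p-value:", p)
```

In this made-up data, store format is split evenly across the sales bands while the distance band is not, so the distance band is chosen for the first split. In a real tree the same test would then be repeated inside each distance group.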
The following example shows the use of CHAID in predicting the basic radius (in miles) of primary trade areas.
Normal Curves

