Regression / Time Series BP Stock Model

The oil spill in the Gulf of Mexico began on April 20, 2010. British Petroleum (BP) stock declined 54.6% from the close on April 20, 2010 to the close on June 25, 2010, a span of 48 trading days. Daily volume rose from 3.8 million on April 20 to 93 million on June 25, with a maximum of 240 million on June 9. I am going to fit a simple regression and time series model to predict the BP stock price based solely on price and volume.

Below is an ANOVA table for a regression with change in price as the dependent variable and change in volume as the independent variable. The coefficient on change in volume is negative and statistically significant, indicating that BP stock declined as volume increased over this time period.
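As a rough illustration of this regression step, here is a minimal Python sketch of an ordinary least squares fit with one predictor. The function name `fit_simple_regression` and the numbers are hypothetical and made up for demonstration; they are not the BP data or the software used for the analysis above.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residuals are the part of y the regression does not explain.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals

# Made-up illustrative values, NOT actual BP data.
volume_change = [1.0, -2.0, 3.0, 0.5, -1.5, 2.5]   # day-over-day change in volume
price_change = [-0.8, 1.1, -2.0, -0.2, 0.9, -1.6]  # day-over-day change in price

intercept, slope, residuals = fit_simple_regression(volume_change, price_change)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

A negative slope in this sketch corresponds to the negative sign on change in volume described above. Judging statistical significance would additionally require a standard error and t-statistic, which this sketch does not compute.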

Because the data form a time series, the residuals of the regression are autocorrelated. I am going to fit an AR(2) model to these residuals. An AR(2) turned out to be the best fit after examining various ARIMA and ARCH/GARCH models.
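To make the AR(2) step concrete, the sketch below estimates the two autoregressive coefficients by conditional least squares, regressing each residual on its two previous values. This is one common way to fit an AR(2) and is an assumption here, since the estimation method used above is not stated. The function name `fit_ar2` and the residual values are hypothetical.

```python
def fit_ar2(series):
    """Conditional least squares estimates (phi1, phi2) for a zero-mean AR(2):
    e[t] = phi1 * e[t-1] + phi2 * e[t-2] + noise.
    """
    if len(series) < 5:
        raise ValueError("need at least 5 observations to fit an AR(2)")
    y = series[2:]       # e[t]
    lag1 = series[1:-1]  # e[t-1]
    lag2 = series[:-2]   # e[t-2]
    s11 = sum(a * a for a in lag1)
    s22 = sum(b * b for b in lag2)
    s12 = sum(a * b for a, b in zip(lag1, lag2))
    s1y = sum(a * c for a, c in zip(lag1, y))
    s2y = sum(b * c for b, c in zip(lag2, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 * s12
    phi1 = (s1y * s22 - s12 * s2y) / det
    phi2 = (s11 * s2y - s12 * s1y) / det
    return phi1, phi2

# Made-up regression residuals, NOT the actual BP residuals.
example_residuals = [0.5, -0.2, 0.4, -0.3, 0.1, 0.2, -0.4, 0.3, -0.1, 0.2]
phi1, phi2 = fit_ar2(example_residuals)

# One-step-ahead forecast of the next residual from the last two values.
next_residual = phi1 * example_residuals[-1] + phi2 * example_residuals[-2]
print(f"phi1 = {phi1:.3f}, phi2 = {phi2:.3f}, forecast = {next_residual:.3f}")
```

Under this setup, the fitted value for a given day is the regression prediction plus the AR(2) prediction of that day's residual.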

To examine the fit of the model, an estimate of the error variance is computed from the actual stock prices and the fitted values. The mean squared error (MSE), the average squared difference between actual and fitted prices, serves as that estimate.

For this model, MSE = 2.8.
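For reference, a minimal Python sketch of the MSE calculation follows. It assumes the plain average of squared errors with divisor n; a version adjusted for degrees of freedom would divide by n minus the number of estimated parameters instead. The function name and the prices are hypothetical and are not the BP series.

```python
def mean_squared_error(actual, fitted):
    """Average squared difference between actual and fitted values."""
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)

# Made-up prices for illustration only.
actual_prices = [30.0, 29.5, 31.0, 28.0]
fitted_prices = [31.0, 29.0, 30.0, 30.0]
print(mean_squared_error(actual_prices, fitted_prices))
```

Because the errors are squared, MSE is in squared price units. Its square root gives a typical error size in dollars.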


For a model that uses only price changes and volume, this is a fairly accurate fit and predictor, although there are a few trouble points. Accuracy could likely be improved by adding further explanatory variables to the model.

Disclaimer:  Please note this is a demonstration and for academic use only. 

Benefits and Uses of Statistical Research

Identify Risk or Opportunity
Statistical research and data mining models can be used to identify specific risks and opportunities for a company. Credit card companies use data mining models to flag possibly fraudulent transactions and to estimate the probability that a consumer will default on a loan or miss a payment. Statistical research can also be used to identify high quality consumers who minimize borrowing risk and maximize earnings.

Market Segmentation
Statistical research can be used to identify high quality consumers who are profitable to retain and low quality consumers who are not. High quality consumers who are at risk of canceling a service may be more likely to respond to one marketing strategy than to an alternative. Statistical research and predictive modeling techniques can be used to maximize the retention of quality consumers.

Cross Selling
Companies today collect vast amounts of data about their customers, including demographics and purchasing habits and behavior. For organizations that collect and maintain such detailed data, statistical research and data mining models can be used to identify current customers who are likely to purchase additional products.

Identify New Markets
Statistical research can be used to identify new markets and opportunities. Properly designed surveys and sampling techniques can be used to identify consumers who are likely to purchase a brand new product and to assess whether it would be profitable for a company to bring it to market.

Minimize Variability of a Process
A company is often more concerned about the variability of a response around its mean than about the mean response itself. Statistical research can be used to ensure a homogeneous product rather than one manufactured to varying tolerance levels. As an example, a company manufacturing semiconductors that need to fit into another company's motherboard would want to minimize variability in dimensions and thickness, so that the products fit and are compatible, rather than focus only on the mean of the product dimensions.
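The point can be illustrated with a small Python sketch comparing two hypothetical production lines that have the same mean dimension but very different spread. All names, measurements, and the tolerance below are invented for illustration.

```python
from statistics import mean, stdev

def summarize(measurements, target, tolerance):
    """Return (mean, sample standard deviation, share of parts within tolerance)."""
    within = sum(1 for m in measurements if abs(m - target) <= tolerance)
    return mean(measurements), stdev(measurements), within / len(measurements)

# Hypothetical part thicknesses in millimeters, target 10.00 mm +/- 0.05 mm.
line_a = [9.98, 10.01, 10.02, 9.99, 10.00, 10.00]
line_b = [9.70, 10.30, 10.25, 9.75, 10.10, 9.90]

mean_a, sd_a, share_a = summarize(line_a, target=10.0, tolerance=0.05)
mean_b, sd_b, share_b = summarize(line_b, target=10.0, tolerance=0.05)

print(f"Line A: mean={mean_a:.2f}, sd={sd_a:.3f}, within tolerance={share_a:.0%}")
print(f"Line B: mean={mean_b:.2f}, sd={sd_b:.3f}, within tolerance={share_b:.0%}")
```

Both lines average the target thickness, yet only the low-variability line produces parts that fit, which is why the spread matters more than the mean in this setting.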

Efficacy of a Process or Product using Design of Experiments
Proper experimental design, including randomization, replication, and blocking (if necessary), can determine whether one drug, diet, exercise program, etc. is more effective than another. Choosing the correct design and the appropriate factors and interactions to investigate before running the experiment is critical. Types of designs include completely randomized designs (CRD), randomized block designs, split plot designs, full and fractional factorial designs, etc.
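As a small illustration of randomization with blocking, the Python sketch below assigns treatments at random within each block so that every treatment is replicated equally in every block. The function name, block labels, subject IDs, and treatment names are all hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomly assign treatments to units within each block.

    blocks: dict mapping block name -> list of unit IDs.
    Each block size must be a multiple of the number of treatments
    so that every treatment is replicated equally within a block.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"block {block_name!r} cannot be split evenly")
        replicates = len(units) // len(treatments)
        labels = list(treatments) * replicates
        rng.shuffle(labels)  # randomization happens separately inside each block
        for unit, label in zip(units, labels):
            assignment[unit] = (block_name, label)
    return assignment

# Hypothetical study: subjects blocked by age group, two diets compared.
blocks = {
    "under_40": ["s1", "s2", "s3", "s4"],
    "40_and_over": ["s5", "s6", "s7", "s8"],
}
plan = randomized_block_assignment(blocks, ["diet_A", "diet_B"], seed=1)
for unit, (block_name, treatment) in sorted(plan.items()):
    print(unit, block_name, treatment)
```

Blocking on a factor such as age group keeps that factor from being confounded with the treatment, while the shuffle within each block supplies the randomization.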