Oil production and prices in the US have fluctuated for a long time, and there is a need to identify whether production had a significant effect on oil prices over the period from 1986 to 2005. This analysis could be of interest to the energy administration commission and the government for planning oil price controls as well as investment in oil production infrastructure. (BP, 2013)
This involves testing whether a relationship exists between the variables and establishing the nature of that relationship. To do so, the analysis applies the ordinary least squares (OLS) method to estimate the relationship and interpret its meaning for oil pricing. The analysis therefore tests the hypothesis that the US oil price is significantly related to the US production quantity. (Chiang & Wainright, 2005)
Null hypothesis (H0): b1 = 0.
Alternative hypothesis (H1): b1 ≠ 0.
Where: b1 is the slope coefficient on the produced quantity in the regression of the US oil price on US oil production. (Makridakis & Wheelwright, 2008)
- Data description
- Data collection
Data for the study were collected from secondary sources: US crude oil production and price data were obtained from published reports, including government reports on the energy sector. Information from other studies and market views on prices and production was also collected from published reports, including industry reports. (Lacobucci & Churchill, 2010)
- Data type
The data used in this analysis take the form of a time series, which is a sequence of observations of a variable measured at successive points over a given period. The observations are annual, with the crude oil price and production per year taken into consideration. This helps in identifying the pattern and relationship between oil prices and production over the study period from 1986 to 2005. (Lacobucci & Churchill, 2010)
The variables under study are the US oil production quantity and the crude oil price, where the oil price is the dependent variable and the production quantity is the independent variable. The analysis thus seeks to identify the effect that changes in oil production have had on oil prices over the years. (Makridakis & Wheelwright, 2008)
- Estimation technique and its properties.
The estimation technique applied in this analysis is the ordinary least squares (OLS) method, which estimates a regression model by minimizing the sum of squared residuals. Its properties include:
- Sum of residuals equals zero.
- The OLS line of fit passes through the mean of the sample. (Clements & Hendry, 2011)
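The two properties listed above can be checked numerically. The following is a minimal sketch, assuming a simple one-variable model with an intercept; the data values and the helper name `fit_ols` are hypothetical and are not the series used in this study.

```python
# Hypothetical illustrative data (NOT the EIA series used in the study).
production = [8.7, 8.3, 8.1, 7.6, 7.4, 7.4, 7.2, 6.8, 6.7, 6.6]
price = [15.0, 19.2, 16.0, 19.6, 24.5, 21.5, 20.6, 18.4, 17.2, 18.4]


def fit_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_ols(production, price)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(production, price)]

x_bar = sum(production) / len(production)
y_bar = sum(price) / len(price)

# Property 1: the residuals sum to zero (up to floating-point error).
print("sum of residuals:", sum(residuals))
# Property 2: the fitted line passes through the point of sample means.
print("fitted value at mean x:", b0 + b1 * x_bar, "mean y:", y_bar)
```

Both properties follow from including an intercept in the model; they would not generally hold for a regression forced through the origin.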
- Software used to fit the model
Excel has been used in this analysis to compute the regression model parameters from the first sample, covering 1986 to 1995, and to forecast oil prices for the second sample, covering 1996 to 2005.
With the two samples of oil price and production data, the regression model and its parameters are estimated from the first sample and tested by forecasting oil prices for the second sample as follows.
Data Source: (U.S. EIA, 2014a, 2014b)
- Trend and seasonality
A trend exists when a time series shows gradual movement or shifts to relatively lower or higher values over the period in consideration. This occurs as a result of change in long-term factors which may include production quantity, inflation, demand and production cost. In that respect, the oil prices have shown an upward trend as indicated on the price chart. (Makridakis & Wheelwright, 2008)
Seasonality, on the other hand, is a pattern that repeats at regular intervals within each year, such as quarterly or monthly swings, and it appears when multiyear data are recorded at sub-annual frequency. Seasonality does not feature in this case because the observations are annual, so no within-year periods were considered in the analysis. (Clements & Hendry, 2011)
- Differences between sub-samples
With the data under analysis ranging from 1986 to 2005, the period from 1986 to 1995 is treated as one sub-sample and the period from 1996 to 2005 as another. The forecasting exercise indicates that the regression model developed from the first sample is not a good estimator of the second sample, which suggests that the two samples differ. (Stock & Watson, 2003)
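The sub-sample comparison described above can be sketched as a fit on one sample followed by a forecast of the other. This is a minimal sketch under stated assumptions: the numbers are hypothetical placeholders rather than the study's data, `fit_ols` is an illustrative helper, and the mean absolute error (MAE) and root mean squared error (RMSE) are common accuracy measures that the original analysis does not necessarily report.

```python
import math


def fit_ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical first sample: used to estimate the model.
train_production = [8.7, 8.3, 8.1, 7.6, 7.4, 7.4, 7.2, 6.8, 6.7, 6.6]
train_price = [15.0, 19.2, 16.0, 19.6, 24.5, 21.5, 20.6, 18.4, 17.2, 18.4]

# Hypothetical second sample: held out and used only for evaluation.
test_production = [6.5, 6.5, 6.3, 5.9, 5.8, 5.8, 5.7, 5.6, 5.4, 5.2]
test_price = [22.0, 20.6, 14.4, 19.3, 30.3, 25.9, 26.1, 31.1, 41.4, 56.5]

b0, b1 = fit_ols(train_production, train_price)

# Forecast second-sample prices from second-sample production.
forecasts = [b0 + b1 * q for q in test_production]
errors = [actual - f for actual, f in zip(test_price, forecasts)]

mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

Large forecast errors relative to the typical price level would support the conclusion that a model estimated on the first sample does not carry over to the second.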
- Forecast evaluation study
The slope has a p-value of 0.427, which is the probability of observing a slope estimate at least this far from zero if production had no linear effect on price. Because this is well above the conventional 0.05 threshold, the data provide no significant evidence of a linear relationship between US oil prices and the quantity produced in the US, and the hypothesis that the slope is zero cannot be rejected. (Makridakis & Wheelwright, 2008)
The correlation coefficient measures the strength and direction of the linear relationship between two variables. Its value ranges from -1 to 1: a value close to zero indicates a weak correlation, while a value close to -1 or 1 indicates a strong negative or positive relationship. The correlation coefficient of -0.28 for this regression is close to zero, which indicates a weak negative relationship between oil production quantity and oil prices in the US. Thus, a change in US oil production is associated with a slight change in oil prices in the opposite direction. (Clements & Hendry, 2011)
Covariance measures whether the variables under study move together linearly, with a value of 0 indicating no linear dependence between them. Its sign gives the direction of the relationship, but its magnitude depends on the units of measurement and so does not by itself indicate strength. The covariance of -181 in this analysis therefore indicates that oil production quantity and oil price in the US tend to move in opposite directions. (Montgomery et al, 2001)
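The difference between the two measures can be illustrated with a short calculation. This is a minimal sketch assuming sample (n-1) formulas; the data are hypothetical and the function names are illustrative only. It shows that rescaling one variable changes the covariance but leaves the correlation coefficient unchanged.

```python
import math


def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical data: production in million barrels/day, price in $/barrel.
production = [8.7, 8.3, 8.1, 7.6, 7.4, 7.4, 7.2, 6.8, 6.7, 6.6]
price = [15.0, 19.2, 16.0, 19.6, 24.5, 21.5, 20.6, 18.4, 17.2, 18.4]

cov = covariance(production, price)
r = correlation(production, price)

# Express production in thousand barrels/day instead (multiply by 1000).
production_scaled = [q * 1000 for q in production]
cov_scaled = covariance(production_scaled, price)
r_scaled = correlation(production_scaled, price)

print("covariance:", cov, "after rescaling:", cov_scaled)
print("correlation:", r, "after rescaling:", r_scaled)
```

This is why a covariance such as -181 conveys direction only, while the correlation coefficient can be read directly as a measure of strength.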
The t-test assesses whether two quantities differ from each other by more than chance would explain. In this analysis it is applied to the estimated regression coefficients to determine whether each differs significantly from zero. The t-value is positive if the estimate lies above the value it is compared with and negative if it lies below. The t-value is then checked against a significance level to determine whether the difference could be due to chance. This requires setting an alpha level of 0.05, which fixes the accepted level of risk: 5 times in 100, a statistically significant difference would be found even though it arose purely by chance. The degrees of freedom are also needed; these are n-2, where n is the number of observations. For this test, alpha is therefore 0.05 and the degrees of freedom are (10-2) = 8. (Stock & Watson, 2003)
- t for slope = -0.84
- t for intercept = 2.84
The t-value is small when the difference being tested is small relative to its standard error, and a small value indicates weak evidence that the difference is real. The value is interpreted by its absolute size, regardless of sign. With 8 degrees of freedom and a 0.05 significance level, the two-sided critical value is about 2.306. The slope's t-value of -0.84 falls well short of this in absolute terms, so the slope is not statistically different from zero, whereas the intercept's t-value of 2.84 exceeds it. (Montgomery et al, 2001)
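As a consistency check, the slope's t-value in a one-variable regression can be recovered from the correlation coefficient and the sample size. The sketch below uses the rounded correlation of -0.28 and n = 10 from this analysis; the critical value 2.306 is taken from a standard t-table (two-sided, 5% level, 8 degrees of freedom) and is hard-coded as an assumption rather than computed.

```python
import math

r = -0.28      # reported correlation coefficient (rounded)
n = 10         # observations in the first sample
df = n - 2     # degrees of freedom for a simple regression

# In simple regression the slope's t-statistic equals
# r * sqrt(n - 2) / sqrt(1 - r^2).
t_slope = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Two-sided 5% critical value for 8 degrees of freedom (from a t-table).
t_critical = 2.306

reject_zero_slope = abs(t_slope) > t_critical

print(f"t for slope = {t_slope:.3f}")
print("reject zero slope at 5%:", reject_zero_slope)
```

The result is close to the reported -0.84; the small gap is presumably due to the correlation being rounded to two decimals before this calculation.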
- R2 test
R2 = Explained variation / Total variation
The test measures how close the data lie to the fitted regression line, and R2 is referred to as the coefficient of determination. The value is easy to estimate and gives the percentage of the dependent variable's variation that is explained by the estimated model. It ranges from zero to 100%, with zero indicating that the estimated model explains none of the dependent variable's variation and 100% indicating that all of it is explained by the model. The model's R2 equals 0.08, which means that 8% of the variation in the oil price is explained by the estimated function. (Stock & Watson, 2003)
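The ratio above can be computed directly from the fitted values. The following is a minimal sketch on hypothetical data (not the study's series), with illustrative function names; it also checks the standard result that, in a regression with one explanatory variable and an intercept, R2 equals the squared correlation coefficient.

```python
import math


def r_squared(x, y):
    """R2 = explained variation / total variation for a simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    explained = sum((fi - y_bar) ** 2 for fi in fitted)
    total = sum((yi - y_bar) ** 2 for yi in y)
    return explained / total


def correlation(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative data.
production = [8.7, 8.3, 8.1, 7.6, 7.4, 7.4, 7.2, 6.8, 6.7, 6.6]
price = [15.0, 19.2, 16.0, 19.6, 24.5, 21.5, 20.6, 18.4, 17.2, 18.4]

r2 = r_squared(production, price)
r = correlation(production, price)

print("R2:", r2, "r squared:", r ** 2)

# The reported figures are consistent with this identity.
reported_r2 = round((-0.28) ** 2, 2)
print("(-0.28)^2 rounded to two decimals:", reported_r2)
```

The reported correlation of -0.28 squares to roughly 0.08, which matches the R2 stated in the text.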
In light of the tests and the forecast assessment carried out on the second sample, it can be concluded that, although the oil price in the US and the quantity of oil produced in the US are weakly negatively correlated, the estimated regression model is inadequate for forecasting oil prices. The model therefore cannot be expected to provide forecasts that policy makers can rely on. (Clements & Hendry, 2011)
The analysis points to a weak negative association between the quantity of oil produced in the US and the US oil price over the period under study. However, the effect is not statistically significant, so the hypothesis that production has no linear effect on price cannot be rejected, which indicates that other factors affect oil prices. In addition, the estimated model's inability to predict the second sample's oil prices shows its lack of accuracy, so it cannot be relied upon to forecast the oil price without considering other factors. It is therefore necessary to include more factors in the model to improve its accuracy. (Makridakis & Wheelwright, 2008)
BP. (2013). BP Energy Outlook 2030. Retrieved from,
Chiang, A. & Wainright, K. (2005). Fundamental mathematics for economics.
Clements, M. & Hendry, D. (2011). The Oxford Handbook of Economic Forecasting. Oxford: Oxford University Press.
Lacobucci, D. & Churchill, G. (2010). Marketing Research: Methodical Foundations (10th ed.). Ohio: South-Western College Publishers.
Makridakis, S. & Wheelwright, S. (2008). Forecasting Methods and Applications (3rd ed.). New Delhi: Wiley India Pvt. Ltd.
Montgomery, D., Peck, E. & Vining, G. (2001). Introduction to Linear Regression Analysis (3rd ed.). New York: John Wiley & Sons.
Stock, J. & Watson, M. (2003). Introduction to Econometrics. Boston: Wesley.
U.S. Energy Information Administration. (2014a). Petroleum and other Liquids: US Field Production of Crude Oil. Retrieved from, http://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=pet&s=mcrfpus1&f=a
U.S. Energy Information Administration. (2014b). Retrieved from,