As a chemical engineer, I believe that the price of crude oil affects my job market. Since the second quarter of 2014, the price of crude oil has dropped significantly. Numerous factors can explain the downward trend. Recently, the New York Times reported that the problem was record-high production from North America, South America and OPEC (Organization of the Petroleum Exporting Countries) (Krauss, 2016).
Over the next several weeks, I’ll be looking at the features available in R to forecast commodity prices. I am doing this for three reasons: (1) to better understand the economics behind my field, (2) to build a strong portfolio as a data engineer and (3) to make a buck or two, hopefully by investing some of my own money in the future.
A really useful package is the Quandl package. Quandl publishes financial data for the public to download freely.
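Another tool is STL (Seasonal and Trend decomposition using Loess). Introduced in 1990, STL splits a time series into seasonal, trend and remainder components. In R it is available as the stl() function in the built-in stats package, and the forecast package can forecast from the fitted decomposition.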
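Load the packages (Quandl for the data, forecast for the forecasting step; stl() ships with base R):
library(Quandl)
library(forecast)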
Get data from Quandl and create a summary plot:
oil <- Quandl("OPEC/ORB", trim_start="01-01-2000", trim_end="10-01-2016", type="xts")
plot(oil, xlab="Date", ylab="Cost per Barrel ($USD)", main="OPEC Basket Price")
Ideally, I would prefer the WTI (West Texas Intermediate) oil price because it is a benchmark for light, sweet crude, which industries prefer. However, Quandl offers a more complete dataset with the OPEC basket price.
Oil production fluctuates throughout the year – possibly due to weather, holidays and consumer behaviour. Assuming a cyclic pattern occurring every year (365 days), I isolated the seasonal variation.
attr(oil, 'frequency') <- 365
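The frequency matters because stl() takes the length of the seasonal cycle from it. Here is a minimal sketch of that idea on synthetic data, using base R only. The series, the variable names and the three-year length are my own assumptions for illustration, not the OPEC data.

```r
# Synthetic daily "prices": made-up numbers, NOT the OPEC basket series
set.seed(1)
n <- 3 * 365
days <- seq_len(n)
price <- 50 + 0.01 * days + 5 * sin(2 * pi * days / 365) + rnorm(n, sd = 0.5)

# A ts object stores the cycle length as its frequency
price_ts <- ts(price, frequency = 365)
cycle_length <- frequency(price_ts)

# stl() needs more than two full cycles of data to estimate a seasonal pattern
n_cycles <- length(price_ts) / cycle_length
enough_data <- n_cycles > 2
```

With fewer than two full years of daily data, a yearly cycle could not be estimated this way, so the choice of 365 only makes sense for a series that is long enough.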
Because this is an additive decomposition, adding the random (remainder) component to the seasonal and trend components gives back the original dataset. The important information is the overall trend, without the influence of the seasons. We can forecast this decomposed trend.
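To make the additive property concrete, here is a small sketch that checks it numerically. It uses nottem, a monthly temperature series bundled with R, purely as a stand-in for the oil data; the names fit_demo, parts and adjusted are mine.

```r
# Stand-in data: built-in monthly series, not the OPEC basket price
fit_demo <- stl(nottem, s.window = "periodic")
parts <- fit_demo$time.series      # columns: seasonal, trend, remainder

# Additive decomposition: the three components sum back to the data
rebuilt <- rowSums(parts)
max_gap <- max(abs(rebuilt - as.numeric(nottem)))

# Seasonally adjusted series = data minus the seasonal component
adjusted <- nottem - parts[, "seasonal"]
```

max_gap should be zero up to floating-point error, and adjusted is the series that is left once the seasonal swing is removed.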
fit <- stl(oil, t.window=1, s.window="periodic", robust=TRUE)
fcast <- forecast(fit, method="ets")
plot(fcast, ylab="Cost per Barrel ($USD)")
Forecasting the decomposed trend and then adding back the seasonal variation generates the above model. At this point, there is still much room for improvement: the forecast intervals are too wide to be of practical use.
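For anyone curious about what "forecast the adjusted series, then add the seasons back" looks like by hand, here is a rough base-R sketch. It is not what the forecast package does internally line for line: I am assuming simple exponential smoothing (HoltWinters with beta and gamma switched off) in place of ets, and I am using the built-in nottem series instead of the oil data.

```r
# Stand-in data: built-in monthly series, not the OPEC basket price
fit_demo <- stl(nottem, s.window = "periodic")
seasonal <- fit_demo$time.series[, "seasonal"]
adjusted <- nottem - seasonal                 # trend + remainder

# Assumption: simple exponential smoothing stands in for ets()
ses <- HoltWinters(adjusted, beta = FALSE, gamma = FALSE)
h <- 24                                       # forecast horizon, in months
pred <- predict(ses, n.ahead = h, prediction.interval = TRUE, level = 0.95)

# Re-seasonalise by repeating the last full seasonal cycle over the horizon.
# This lines up only because nottem ends on a complete yearly cycle.
period <- frequency(nottem)
last_cycle <- tail(as.numeric(seasonal), period)
season_future <- rep(last_cycle, length.out = h)

# pred has columns fit, upr, lwr; the seasonal term shifts all three alike
final <- pred + season_future
```

The width of the interval (upr minus lwr) comes entirely from the forecast of the adjusted series, so a noisy trend estimate translates directly into wide limits.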