Looking at Oil with R (Part 1)

As a chemical engineer, I believe that the price of crude oil impacts my job market. Since the second quarter of 2014, the price of crude oil has dipped significantly. There are numerous factors that can explain the downward trend. Recently, the New York Times reported that the problem was record-high production from North America, South America and OPEC (the Organization of the Petroleum Exporting Countries) (Krauss, 2016).

Over the next several weeks, I’ll be looking at the features available in R to forecast commodity prices. I am doing this for three reasons: (1) to better understand the economics behind my field, (2) to build a strong portfolio as a data engineer, and (3) to make a buck or two by hopefully investing some of my own money in the future.

A really useful package is the Quandl package. Quandl publishes financial data for the public to download freely.


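Another tool is STL (Seasonal and Trend decomposition using Loess), which ships with R as the stl() function in the built-in stats package, so there is nothing extra to install. Developed in the 1990s, STL specializes in separating a time series into seasonal, trend and remainder components.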


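Load the packages:

library(Quandl)    # Quandl() for downloading the price data
library(forecast)  # forecast() for STL decompositions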


Get data from Quandl and create a summary plot:

# Quandl expects dates as yyyy-mm-dd; the end date is read here as 1 October 2016
oil <- Quandl("OPEC/ORB", start_date = "2000-01-01", end_date = "2016-10-01", type = "xts")
plot(oil, xlab = "Date", ylab = "Cost per Barrel (USD)", main = "OPEC Basket Price")

Ideally, I would prefer the WTI (West Texas Intermediate) oil price because it is a benchmark for light, sweet crude, which industry prefers. However, Quandl offers a more complete dataset for the OPEC basket price.

Oil production fluctuates throughout the year – possibly due to weather, holidays and consumer behaviour. Assuming a cyclic pattern occurring every year (365 days), I isolated the seasonal variation.

attr(oil, "frequency") <- 365
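Before fixing the period at 365, it may be worth checking how many observations the series actually has per year. If prices are only reported on weekdays (an assumption here, not something verified against the OPEC data), one year of data holds roughly 261 observations rather than 365, and a period of 365 would then span more than one calendar year. A minimal sketch of that count, using only base R and a made-up example year:

```r
# All calendar days in one example year (2015 is an arbitrary choice)
days <- seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by = "day")

# ISO weekday number: 1 = Monday ... 7 = Sunday (locale independent)
iso_wday <- as.integer(format(days, "%u"))

n_calendar <- length(days)        # every day of the year
n_weekdays <- sum(iso_wday <= 5)  # Monday to Friday only

print(c(calendar = n_calendar, weekdays = n_weekdays))
```

For the real series, counting rows per year (for example with table(format(index(oil), "%Y"))) would show which figure is closer.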


Because this is an additive decomposition, adding the seasonal, trend and random (remainder) components together reproduces the original dataset. The important information is the overall trend, without the influence of seasons. We can forecast this decomposed trend.
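The additive property can be checked directly. The following is a minimal sketch on a synthetic monthly series (the series x and all of its parameters are invented for illustration; this is not the OPEC data), using stl() from the stats package:

```r
# Synthetic series: linear trend + yearly cycle + noise (illustrative only)
set.seed(42)
n <- 120
t <- seq_len(n)
x <- ts(50 + 0.3 * t + 5 * sin(2 * pi * t / 12) + rnorm(n), frequency = 12)

# STL returns a matrix with seasonal, trend and remainder columns
fit_demo <- stl(x, s.window = "periodic", robust = TRUE)
parts <- fit_demo$time.series

# Additive decomposition: the three columns sum back to the data
rebuilt <- rowSums(parts)
max_gap <- max(abs(rebuilt - as.numeric(x)))
print(max_gap)  # largest difference between the rebuilt and original series
```

The gap should be zero up to floating-point rounding, since the remainder is defined as whatever is left after removing the seasonal and trend parts.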

# Decompose with STL, then forecast the result with an ETS model
fit <- stl(oil, t.window = 1, s.window = "periodic", robust = TRUE)
fcast <- forecast(fit, method = "ets")
plot(fcast, ylab = "Cost per Barrel (USD)")

Forecasting the seasonally adjusted series (trend plus remainder) and then adding the seasonal variation back generates the above forecast. At this point, there is still much room for improvement: the prediction intervals are too wide for the forecast to be useful.
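To make the mechanics concrete, here is a rough sketch of the same three steps done by hand on a synthetic monthly series (again invented for illustration, not the OPEC data). As a loudly labelled substitution, simple exponential smoothing from base R's HoltWinters() stands in for the ets() model, so the numbers will not match what the forecast package produces:

```r
# Synthetic monthly series (illustrative only, not the OPEC data)
set.seed(1)
n <- 120
t <- seq_len(n)
x <- ts(50 + 0.3 * t + 5 * sin(2 * pi * t / 12) + rnorm(n), frequency = 12)

fit_demo <- stl(x, s.window = "periodic")
seasonal <- fit_demo$time.series[, "seasonal"]

# 1. Remove the seasonal component
adjusted <- x - seasonal

# 2. Forecast the seasonally adjusted series.
#    Assumption: simple exponential smoothing replaces ets() here.
h <- 12
ses <- HoltWinters(adjusted, beta = FALSE, gamma = FALSE)
adj_fc <- predict(ses, n.ahead = h, prediction.interval = TRUE)

# 3. Add back the seasonal pattern from the last full cycle
last_cycle <- as.numeric(tail(seasonal, 12))
point_fc <- as.numeric(adj_fc[, "fit"]) + last_cycle

# Width of the 95% prediction interval at each horizon
width <- as.numeric(adj_fc[, "upr"] - adj_fc[, "lwr"])
print(round(cbind(point_fc, width), 2))
```

Printing the width column is a quick way to see how fast the uncertainty grows with the forecast horizon, which is the problem noted above.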

Krauss, C. (2016, November 2). Oil Prices: What’s Behind the Volatility? Simple Economics. Retrieved November 6, 2016, from The New York Times: http://www.nytimes.com/interactive/2016/business/energy-environment/oil-prices.html

Shoutout to the Reddit community for their help
