Today’s project builds a linear model describing how multiple ingredients influence concrete’s ability to withstand loads. The model was built in R.
The data come from Chung-Hua University, China. The input variables are cement, slag, fly ash, water, superplasticizer (SP), coarse aggregate and fine aggregate, each measured in kg/m³ of concrete. The output variable is compressive strength after 28 days, measured in MPa. The results show that water is the strongest influencer of compressive strength and slag is the weakest. Superplasticizer had little to no impact and was removed from the model entirely. The compressive strength was found to follow the equation below:
Compressive strength (MPa) = 0.04970*(Cement) - 0.04519*(Slag) + 0.03859*(Fly ash) - 0.27055*(Water) - 0.06986*(Coarse Aggregate) - 0.05358*(Fine Aggregate)
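To make the coefficients concrete, here is a minimal R sketch that uses them to estimate how the predicted strength changes when a mix is adjusted. The equation as printed shows no constant term, so the sketch only computes differences between mixes, which do not depend on any constant. The names `coefs` and `strength_change` and the example adjustment are hypothetical, chosen for illustration.

```r
# Coefficients copied from the equation above (MPa per kg/m^3 of ingredient).
coefs <- c(cement = 0.04970, slag = -0.04519, fly_ash = 0.03859,
           water = -0.27055, coarse_agg = -0.06986, fine_agg = -0.05358)

# Change in predicted 28-day strength (MPa) for a change in the mix.
# `delta` is a named vector of changes in kg/m^3; ingredients that are
# not named are assumed to stay the same.
strength_change <- function(delta, coefs) {
  unknown <- setdiff(names(delta), names(coefs))
  if (length(unknown) > 0) {
    stop("unknown ingredient: ", paste(unknown, collapse = ", "))
  }
  sum(coefs[names(delta)] * delta)
}

# Hypothetical adjustment: 10 kg/m^3 more water and 20 kg/m^3 more cement.
strength_change(c(water = 10, cement = 20), coefs)
```

Under this model the extra water costs more strength than the extra cement gains, which matches water having the largest coefficient in absolute terms.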
Figure: Normalized histogram of residuals
The coefficient of determination indicates a strong fit (R² = 0.8962), and the p-values are low for each variable. The normalized histogram shows that the residuals are approximately normally distributed, which supports the linear model and suggests that systematic error is unlikely.
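A residual check of this kind can be sketched in R as follows. This is not the original code: the data are simulated stand-ins, and the variable names and the numbers used to generate them are assumptions made only so the example runs on its own.

```r
set.seed(1)

# Simulated stand-in data (NOT the concrete dataset): one predictor plus noise.
n <- 100
water <- runif(n, 160, 240)
strength <- 90 - 0.27 * water + rnorm(n, sd = 2.5)

fit <- lm(strength ~ water)
res <- residuals(fit)

# Density-scaled ("normalized") histogram with a normal curve overlaid.
hist(res, freq = FALSE,
     main = "Normalized histogram of residuals", xlab = "Residual (MPa)")
curve(dnorm(x, mean = mean(res), sd = sd(res)), add = TRUE)

# A formal normality check to go with the visual one.
shapiro.test(res)
```

A roughly bell-shaped histogram centered on zero is what the text above describes. A skewed or multi-peaked one would suggest the linear form is missing something.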
The problem was approached by fitting a multivariable linear regression of compressive strength on all the input variables:
Figure: Initial regression
The R² is high. Some of the p-values, however, do not show strong evidence against the null hypothesis, notably those for slag, fine aggregate and SP. Fortunately, R's step() function performs stepwise selection by AIC and keeps only the variables that improve the model.
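The two-stage approach, a full regression followed by step(), might look roughly like the sketch below. It is an illustration under stated assumptions rather than the original script: the data frame is simulated, the column names are hypothetical, the intercept used to generate the data is made up, and backward elimination is assumed as the search direction.

```r
set.seed(42)

# Simulated stand-in for the csv file; column names are hypothetical.
n <- 100
concrete <- data.frame(
  cement     = runif(n, 140, 370),
  slag       = runif(n, 0, 190),
  fly_ash    = runif(n, 0, 260),
  water      = runif(n, 160, 240),
  sp         = runif(n, 4, 19),
  coarse_agg = runif(n, 700, 1050),
  fine_agg   = runif(n, 640, 900)
)

# Response generated so that sp has no real effect (intercept 140 is made up).
concrete$strength <- with(concrete,
  140 + 0.05 * cement - 0.045 * slag + 0.04 * fly_ash - 0.27 * water -
    0.07 * coarse_agg - 0.054 * fine_agg + rnorm(n, sd = 2.5))

# Initial regression on every input variable.
full <- lm(strength ~ ., data = concrete)

# Stepwise selection by AIC; trace = 0 suppresses the step-by-step log.
final <- step(full, direction = "backward", trace = 0)

summary(final)  # coefficients, p-values and R-squared of the reduced model
```

With the real data, `concrete` would instead be read from the csv file with `read.csv()`, and the formula would use that file's actual column names.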
Figure: Final regression
The coefficients are listed in the Estimate column of the output. The full code is as follows:
The links to the code, csv file and original dataset are attached. If you have any ideas for improvement or would like to get in contact, please comment or email me directly at matthewm3109@gmail.com.
Link to code & csv: http://bit.ly/1QRzyjr
Link to original data: https://archive.ics.uci.edu/ml/datasets/Concrete+Slump+Test