Automatic vs Manual for better MPG with R

Executive Summary

Using the mtcars dataset, we explore the relationship between several variables and gas mileage, measured in miles per gallon (MPG). We cannot draw a firm conclusion on whether transmission type is a causal factor in MPG, because the sign of its coefficient flipped depending on which covariates were included in the linear regression.

If we ignore the effects of other variables, however, manual transmission is associated with higher MPG, with a 95% confidence interval of 3.2 to 11.3 MPG above automatic. Domain knowledge and further analysis suggest that this is largely because the automatic vehicles in our data are heavier than the manual ones.

Cursory Evaluation

We load the data and perform a linear regression with MPG as our outcome and transmission type as our only independent variable (predictor).

##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   17.147      1.125  15.247 1.134e-15
## factor(am)1    7.245      1.764   4.106 2.850e-04

##     fit   lwr   upr
## 1 24.39 21.62 27.17

The intercept coefficient, 17.147, is the mean MPG of the first factor level, automatic transmission. The factor(am)1 coefficient, 7.245, is the expected change in MPG for a manual transmission compared to automatic. Since its p-value is below 0.05, we reject the hypothesis that there is no difference between the two transmissions, provided we ignore the effects of any other variables. In addition, the 95% confidence interval for the mean MPG of manual-transmission cars under this fit is 21.6 to 27.2 MPG.
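The code behind the table above is not shown, so the following is a minimal sketch of how such a fit could be produced, assuming the standard `mtcars` data that ships with R. The object name `fit` is our own. The last two calls compare two ways of getting an interval for the manual-versus-automatic difference: the interval quoted in the executive summary (3.2 to 11.3) appears to correspond to a Welch two-sample t-test rather than to the regression coefficient, whose interval is somewhat narrower because the regression pools the variance across groups.

```r
# mtcars ships with base R; am is coded 0 = automatic, 1 = manual
fit <- lm(mpg ~ factor(am), data = mtcars)

# Coefficient table: the intercept is the automatic mean,
# factor(am)1 is the manual-minus-automatic difference
print(summary(fit)$coef)

# 95% confidence interval for the mean MPG of manual cars
manual.ci <- predict(fit, newdata = data.frame(am = 1), interval = "confidence")
print(manual.ci)

# 95% interval for the difference from the regression (pooled variance)
print(confint(fit, "factor(am)1"))

# 95% interval from a Welch t-test (unequal variances);
# t.test reports automatic minus manual, so the signs are reversed
welch.ci <- t.test(mpg ~ am, data = mtcars)$conf.int
print(welch.ci)
```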

Expressed in a boxplot, it is clear that manual transmission achieves higher MPG, on average, than automatic.
The regression line from the above fit is drawn across the plot.
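The figure itself is not reproduced here. A rough sketch of how a comparable plot could be drawn with base graphics follows; the labels and colours are our own choices, not necessarily those of the original figure. Because the boxes are drawn at x = 1 and x = 2, the fitted line for a single two-level predictor is simply the segment joining the two group means.

```r
# Group means: these equal the fitted values of lm(mpg ~ factor(am))
group.means <- tapply(mtcars$mpg, mtcars$am, mean)

boxplot(mpg ~ am, data = mtcars,
        names = c("Automatic", "Manual"),
        xlab = "Transmission", ylab = "Miles per gallon")

# Boxes sit at x = 1 and x = 2, so the regression line connects the two means
lines(c(1, 2), group.means, col = "red", lwd = 2)
```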

Multiple Models

There are 9 other variables beyond MPG and transmission type:

## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"

We perform a linear fit using all possible variables:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.30337   18.71788  0.6573  0.51812
## cyl         -0.11144    1.04502 -0.1066  0.91609
## disp         0.01334    0.01786  0.7468  0.46349
## hp          -0.02148    0.02177 -0.9868  0.33496
## drat         0.78711    1.63537  0.4813  0.63528
## wt          -3.71530    1.89441 -1.9612  0.06325
## qsec         0.82104    0.73084  1.1234  0.27394
## vs           0.31776    2.10451  0.1510  0.88142
## am           2.52023    2.05665  1.2254  0.23399
## gear         0.65541    1.49326  0.4389  0.66521
## carb        -0.19942    0.82875 -0.2406  0.81218

Manual transmission still has a reasonably large positive coefficient (about 2.5 MPG), and weight has an even larger negative one (about -3.7 MPG per 1000 lb). However, none of the coefficients is statistically significant at the 0.05 level; weight comes closest with p = 0.063, so no firm conclusion can be drawn from this model.
Residual plots of the fit and the pairwise correlations show no glaring outliers (see appendix for figures and analyses).
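As a sketch of how the full model could be reproduced, assuming the name `fit.all` used in the appendix refers to a fit of MPG on every other column, the code below refits it and orders the predictors by p-value. The helper name `coefs` is ours.

```r
# Regress mpg on every other column of mtcars
fit.all <- lm(mpg ~ ., data = mtcars)

coefs <- summary(fit.all)$coef
print(round(coefs, 5))

# Predictors ordered from smallest to largest p-value (intercept dropped)
p.sorted <- sort(coefs[-1, "Pr(>|t|)"])
print(round(p.sorted, 5))
```

With eleven coefficients estimated from only 32 cars, and several predictors strongly correlated with one another, large standard errors are to be expected in this model.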

If, however, we take the two predictors with the smallest p-values in the full model, weight and transmission type, and regress MPG on them without an interaction, the effect of transmission becomes negligible compared to that of weight, and its sign turns slightly negative.
On the other hand, if we include the interaction between weight and transmission type, transmission type once again makes a large difference in mileage, although that difference now depends on the weight of the car. The two coefficient sets are shown in sequence below:

summary(fit.wt)$coef
##             Estimate Std. Error  t value  Pr(>|t|)
## (Intercept) 37.32155     3.0546 12.21799 5.843e-13
## factor(am)1 -0.02362     1.5456 -0.01528 9.879e-01
## wt          -5.35281     0.7882 -6.79081 1.867e-07

summary(fit.wt)$coef
##                Estimate Std. Error t value  Pr(>|t|)
## (Intercept)      31.416     3.0201  10.402 4.001e-11
## factor(am)1      14.878     4.2640   3.489 1.621e-03
## wt               -3.786     0.7856  -4.819 4.551e-05
## factor(am)1:wt   -5.298     1.4447  -3.667 1.017e-03
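The model formulas are not shown, so the sketch below assumes the two fits were `mpg ~ factor(am) + wt` and `mpg ~ factor(am) * wt`; the names `fit.add`, `fit.int` and `crossover` are ours. In the interaction model the manual-versus-automatic difference at a given weight is the `factor(am)1` coefficient plus the `factor(am)1:wt` coefficient times `wt`, so it is positive for light cars and negative for heavy ones. The weight at which it changes sign follows directly from the two coefficients.

```r
# Additive model: one common weight slope, constant transmission offset
fit.add <- lm(mpg ~ factor(am) + wt, data = mtcars)
# Interaction model: the weight slope is allowed to differ by transmission
fit.int <- lm(mpg ~ factor(am) * wt, data = mtcars)

print(summary(fit.add)$coef)
print(summary(fit.int)$coef)

# Manual-minus-automatic difference at weight w is b1 + b3 * w;
# it is zero at w = -b1 / b3 (wt is measured in 1000 lb)
b <- coef(fit.int)
crossover <- -b[["factor(am)1"]] / b[["factor(am)1:wt"]]
print(crossover)

# F-test of whether the interaction term improves on the additive model
print(anova(fit.add, fit.int))
```

With the coefficients printed above, the crossover is 14.878 / 5.298, or roughly 2.8, meaning about 2,800 lb: below that weight the interaction model predicts better mileage for manual cars, above it for automatic ones.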

Results

It is difficult to say whether transmission type is a causal factor in MPG or simply correlated with it, because the sign of the relevant coefficient flips across the linear regression models we fitted. Considering only the association between transmission and MPG, manual transmission goes with better mileage to a statistically significant degree (p < 0.05). Once other variables are included, it is ambiguous whether the true cause is weight, transmission type, or an interaction between the two as in the last model. Drawing on domain knowledge, the more likely explanation is that automatic cars are generally heavier and manual cars lighter, which produces a correlation between transmission type and MPG without direct causation.
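The claim that automatic cars in this dataset are heavier can be checked directly. The following is a small sketch, not part of the original analysis; `wt.means` and `wt.am.cor` are our own names, and `wt` is recorded in units of 1000 lb.

```r
# Mean weight (1000 lb) by transmission: 0 = automatic, 1 = manual
wt.means <- tapply(mtcars$wt, mtcars$am, mean)
print(round(wt.means, 3))

# Welch t-test for the difference in mean weight between the two groups
print(t.test(wt ~ am, data = mtcars))

# Correlation between weight and the transmission indicator
wt.am.cor <- cor(mtcars$wt, mtcars$am)
print(wt.am.cor)
```

A clearly negative correlation between `wt` and `am` is what makes the two effects hard to separate: most of the heavy cars are automatic, so the data contain little information about heavy manual cars or light automatic ones.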

Appendix

Residual analysis and figures: We examine the influence of each observation on each coefficient of the full model (dfbetas) and find none exceedingly out of place, with a maximum magnitude of about 1.7.
round(dfbetas(fit.all), 3)
##                     (Intercept)    cyl   disp     hp   drat     wt   qsec
## Mazda RX4                -0.080  0.006 -0.097  0.260  0.035  0.109 -0.016
## Mazda RX4 Wag             0.005 -0.032 -0.010  0.141  0.003  0.014 -0.047
## Datsun 710               -0.248  0.138  0.299 -0.305  0.187 -0.386  0.273
## Hornet 4 Drive            0.006  0.001  0.016 -0.011 -0.015 -0.013  0.000
## Hornet Sportabout        -0.015  0.022  0.102 -0.041  0.000 -0.120  0.043
## Valiant                  -0.014 -0.132  0.050  0.017  0.393 -0.031 -0.055
## Duster 360               -0.012  0.013 -0.008 -0.007  0.002  0.014  0.002
## Merc 240D                 0.181 -0.241  0.061 -0.175 -0.087  0.082 -0.184
## Merc 230                  1.262  0.024 -0.262 -0.528 -0.271  0.569 -1.595
## Merc 280                 -0.002  0.071 -0.046 -0.067  0.061  0.060 -0.080
## Merc 280C                 0.110 -0.220  0.077  0.156 -0.162 -0.083  0.068
## Merc 450SE               -0.010  0.223 -0.567  0.243  0.043  0.435 -0.129
## Merc 450SL               -0.057  0.136 -0.210  0.117 -0.003  0.070  0.056
## Merc 450SLC               0.040 -0.053  0.068 -0.044 -0.002 -0.018 -0.042
## Cadillac Fleetwood        0.019  0.157 -0.353  0.234  0.101  0.075 -0.103
## Lincoln Continental      -0.001  0.040 -0.036  0.025  0.007 -0.045  0.004
## Chrysler Imperial         0.043 -0.306 -0.195  0.232  0.321  0.775 -0.284
## Fiat 128                 -0.211  0.251 -0.305  0.165  0.077  0.309  0.116
## Honda Civic              -0.040  0.034  0.116 -0.107  0.154 -0.111  0.033
## Toyota Corolla           -0.559  0.351  0.102  0.112  0.198 -0.310  0.668
## Toyota Corona            -0.423  0.594  0.058 -0.443  0.014  0.171  0.080
## Dodge Challenger         -0.086 -0.061 -0.011  0.108  0.149  0.036  0.062
## AMC Javelin               0.083 -0.239  0.030  0.102 -0.033  0.069 -0.047
## Camaro Z28               -0.001  0.001  0.000 -0.001 -0.001  0.000  0.001
## Pontiac Firebird          0.001 -0.002  0.326 -0.195 -0.030 -0.224  0.034
## Fiat X1-9                -0.005 -0.019  0.013 -0.004  0.002 -0.010  0.004
## Porsche 914-2            -0.018  0.043  0.013 -0.009 -0.017 -0.025  0.024
## Lotus Europa              0.510 -0.312  0.260 -0.049 -0.660 -0.256 -0.400
## Ford Pantera L            1.503 -1.325 -0.008 -0.534 -1.263 -0.291 -0.196
## Ferrari Dino              0.001  0.000  0.000  0.000 -0.001  0.000  0.000
## Maserati Bora            -0.118 -0.026  0.044  0.308 -0.221 -0.210  0.246
## Volvo 142E               -0.210  0.187  0.342 -0.300 -0.075 -0.519  0.314
##                         vs     am   gear   carb
## Mazda RX4            0.099 -0.140  0.167 -0.259
## Mazda RX4 Wag        0.091 -0.129  0.087 -0.126
## Datsun 710          -0.191 -0.371  0.103  0.407
## Hornet 4 Drive       0.021  0.003 -0.002  0.008
## Hornet Sportabout   -0.029 -0.041  0.013  0.021
## Valiant             -0.242 -0.165 -0.076  0.057
## Duster 360           0.002  0.008  0.014 -0.010
## Merc 240D           -0.016 -0.371  0.220  0.038
## Merc 230             0.938  0.403 -0.561 -0.171
## Merc 280             0.115 -0.083  0.047  0.011
## Merc 280C           -0.245  0.176 -0.133 -0.064
## Merc 450SE          -0.120 -0.008  0.037 -0.387
## Merc 450SL          -0.110 -0.012 -0.007 -0.108
## Merc 450SLC          0.046 -0.005 -0.002  0.034
## Cadillac Fleetwood  -0.021 -0.171  0.051 -0.244
## Lincoln Continental -0.005 -0.046  0.011 -0.013
## Chrysler Imperial    0.002  0.233 -0.166 -0.321
## Fiat 128             0.115  0.572 -0.068 -0.349
## Honda Civic          0.058  0.031 -0.109  0.121
## Toyota Corolla      -0.047  0.563 -0.160  0.029
## Toyota Corona        0.230  0.370  0.474 -0.070
## Dodge Challenger     0.005  0.037 -0.064  0.026
## AMC Javelin          0.084  0.078 -0.095  0.090
## Camaro Z28           0.000  0.001  0.001  0.000
## Pontiac Firebird    -0.014 -0.053  0.076  0.044
## Fiat X1-9           -0.039 -0.081  0.033  0.026
## Porsche 914-2        0.074  0.048 -0.046  0.044
## Lotus Europa         0.345 -0.250  0.423 -0.033
## Ford Pantera L      -0.457 -0.094 -1.659  1.395
## Ferrari Dino         0.000  0.000  0.000  0.001
## Maserati Bora        0.080  0.183 -0.029  0.262
## Volvo 142E          -0.216 -0.372  0.261  0.306
par(mfrow = c(2, 2))
plot(fit.all)
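Scanning a 32-by-11 table by eye is error-prone, so here is a short sketch for locating the largest entry programmatically. It assumes `fit.all` is the all-variables fit of `mpg ~ .`; the names `db`, `worst`, `worst.car` and `worst.coef` are ours. An absolute value above 1 is a commonly used rule of thumb for flagging an observation in a small dataset, and is treated here as a convention rather than a hard test.

```r
fit.all <- lm(mpg ~ ., data = mtcars)
db <- dfbetas(fit.all)

# Position of the single largest absolute dfbetas value
worst <- which(abs(db) == max(abs(db)), arr.ind = TRUE)
worst.car  <- rownames(db)[worst[1, "row"]]
worst.coef <- colnames(db)[worst[1, "col"]]
print(c(car = worst.car, coefficient = worst.coef))
print(round(max(abs(db)), 3))

# Cars with at least one |dfbetas| above the rule-of-thumb cutoff of 1
print(rownames(db)[apply(abs(db) > 1, 1, any)])
```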
