# Automatic vs Manual for better MPG with R

## Executive Summary

Using the mtcars dataset, we explore the relationship between several variables and gas mileage, measured in miles per gallon (MPG). We cannot deliver a solid conclusion on whether transmission type is a causal factor in MPG, as the sign of its coefficient flipped depending on which covariates were included in the linear regression.

If we ignore the effects of other variables, however, manual transmission is correlated with higher MPG, with a 95% confidence interval of 3.2 to 11.3 MPG above automatic. Domain knowledge and further analysis suggest that this is because the automatic vehicles in our data are heavier than the manual ones.

## Cursory Evaluation

We load the data and perform a linear regression with MPG as our outcome and transmission type as our only predictor (independent variable).

```
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   17.147      1.125  15.247 1.134e-15
## factor(am)1    7.245      1.764   4.106 2.850e-04
##     fit   lwr   upr
## 1 24.39 21.62 27.17
```

The intercept coefficient, 17.147, is also the mean MPG of the first factor level, automatic transmission. The `factor(am)1` coefficient, 7.245, is the expected change in MPG for a manual transmission compared to automatic. Since its p-value is below 0.05, we would reject the hypothesis that there is no difference between the two transmissions, provided that we ignore the effects of any other variables. In addition, the 95% confidence interval for the mean MPG of manual cars under this fit is 21.6 to 27.2.
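The fit and interval discussed above can be reproduced along the following lines. This is a minimal sketch, not necessarily the original code: the object names `fit.am`, `mean.auto`, and `ci.manual` are assumptions introduced here for illustration.

```r
# Sketch: MPG regressed on transmission type alone (am: 0 = automatic, 1 = manual).
data(mtcars)
fit.am <- lm(mpg ~ factor(am), data = mtcars)  # fit.am is a hypothetical name

# Coefficient table: the intercept is the mean MPG of automatics,
# factor(am)1 is the difference in mean MPG, manual minus automatic.
print(summary(fit.am)$coef)

# The intercept should coincide with the raw group mean for automatics.
mean.auto <- mean(mtcars$mpg[mtcars$am == 0])
print(mean.auto)

# 95% confidence interval for the mean MPG of a manual car under this fit.
ci.manual <- predict(fit.am, newdata = data.frame(am = 1),
                     interval = "confidence")
print(ci.manual)
```

Comparing `mean.auto` with the intercept is a quick check that the factor coding is being read the intended way round.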

A boxplot makes it clear that manual transmission achieves higher MPG, on average, than automatic.
The regression line from the above fit is drawn across the plot.
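A plot of this kind could be drawn roughly as follows. This is a sketch under assumptions: the axis labels, colour, and the names `fit.am` and `line.y` are illustrative, and the fitted line is drawn between the two box positions because `boxplot` places the groups at x = 1 and x = 2 rather than at the 0/1 coding of `am`.

```r
# Sketch: boxplot of MPG by transmission with the fitted line overlaid.
data(mtcars)
fit.am <- lm(mpg ~ factor(am), data = mtcars)  # hypothetical name
b <- coef(fit.am)

boxplot(mpg ~ am, data = mtcars,
        names = c("Automatic", "Manual"),
        xlab = "Transmission", ylab = "Miles per gallon")

# Fitted group means: automatic at x = 1, manual at x = 2.
line.y <- unname(c(b[1], b[1] + b[2]))
lines(c(1, 2), line.y, col = "red", lwd = 2)
```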

## Multiple Models

There are 9 other variables beyond MPG and transmission type:

`## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"`

We perform a linear fit using all possible variables:

```
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.30337   18.71788  0.6573  0.51812
## cyl         -0.11144    1.04502 -0.1066  0.91609
## disp         0.01334    0.01786  0.7468  0.46349
## hp          -0.02148    0.02177 -0.9868  0.33496
## drat         0.78711    1.63537  0.4813  0.63528
## wt          -3.71530    1.89441 -1.9612  0.06325
## qsec         0.82104    0.73084  1.1234  0.27394
## vs           0.31776    2.10451  0.1510  0.88142
## am           2.52023    2.05665  1.2254  0.23399
## gear         0.65541    1.49326  0.4389  0.66521
## carb        -0.19942    0.82875 -0.2406  0.81218
```

At first glance, manual transmission still has a reasonably large positive effect on MPG (about 2.5), and weight has an even larger negative effect (about -3.7 MPG per 1000 lb). However, no coefficient in this model is statistically significant at the 0.05 level; the smallest p-value, 0.063, belongs to weight.
Residual plots of the fit and the pairwise correlations show no glaring outliers (see the appendix for figures and analyses).
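The full model can be fitted with the `.` shorthand for "all remaining columns". The sketch below reuses the name `fit.all` that appears in the appendix; `coefs.all` and `p.vals` are names assumed here. It also ranks the predictors by p-value, which is one way to see why weight and transmission are carried forward.

```r
# Sketch: regress MPG on every other column of mtcars.
data(mtcars)
fit.all <- lm(mpg ~ ., data = mtcars)
coefs.all <- summary(fit.all)$coef
print(round(coefs.all, 5))

# p-values of the predictors (intercept dropped), smallest first.
p.vals <- sort(coefs.all[-1, "Pr(>|t|)"])
print(p.vals)

# Pairwise correlations of each variable with MPG.
print(round(cor(mtcars)["mpg", ], 2))
```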

If, however, we take the two predictors with the smallest p-values, weight and transmission type, and regress MPG on them without an interaction term, the effect of transmission is negligible compared to that of weight.
On the other hand, if we include the interaction between weight and transmission type, transmission type once again makes a very large difference in mileage. The two coefficient tables below show these fits in order:

```
summary(fit.wt)$coef
##             Estimate Std. Error  t value  Pr(>|t|)
## (Intercept) 37.32155     3.0546 12.21799 5.843e-13
## factor(am)1 -0.02362     1.5456 -0.01528 9.879e-01
## wt          -5.35281     0.7882 -6.79081 1.867e-07
```

```
summary(fit.wt)$coef
##                Estimate Std. Error t value  Pr(>|t|)
## (Intercept)      31.416     3.0201  10.402 4.001e-11
## factor(am)1      14.878     4.2640   3.489 1.621e-03
## wt               -3.786     0.7856  -4.819 4.551e-05
## factor(am)1:wt   -5.298     1.4447  -3.667 1.017e-03
```
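The two fits can be sketched as below. The model formulas are inferred from the coefficient names in the output, and the object names `fit.add`, `fit.int`, and `crossover.wt` are assumptions made here to keep the two models apart. In the interaction model, the manual-minus-automatic difference at weight `w` is `b["factor(am)1"] + b["factor(am)1:wt"] * w`, so the sketch also solves for the weight at which that difference changes sign.

```r
# Sketch: weight and transmission, without and with an interaction term.
data(mtcars)
fit.add <- lm(mpg ~ factor(am) + wt, data = mtcars)  # additive model
fit.int <- lm(mpg ~ factor(am) * wt, data = mtcars)  # adds factor(am)1:wt

print(summary(fit.add)$coef)
print(summary(fit.int)$coef)

# Weight (in 1000 lb, the unit of wt) at which the estimated
# manual-vs-automatic difference in the interaction model is zero.
b <- coef(fit.int)
crossover.wt <- unname(-b["factor(am)1"] / b["factor(am)1:wt"])
print(crossover.wt)
```

Below the crossover weight the interaction model estimates higher MPG for manual cars, and above it lower, which is one way to read the sign flip between models.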

## Results

It is difficult to say whether transmission type is a causal factor in MPG or simply correlated with it, since the sign of the relevant coefficient flips across the linear regression models we fit. Considering only the marginal, non-causal relationship between transmission and MPG, manual transmission is associated with better mileage to a statistically significant degree (p-value < 0.05). Once other variables are considered, it is ambiguous whether weight, transmission type, or an interaction between the two (as in the last model) is the true cause. From our knowledge of the domain, the most likely explanation is that automatic cars are generally heavier and manual cars lighter, which produces a correlation between transmission type and MPG without direct causation.
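The claim that the automatic cars in this data are heavier can be checked directly. The following is a small sketch; the name `wt.by.am` is an assumption, and `wt` is recorded in units of 1000 lb.

```r
# Sketch: compare weights across transmission types (0 = automatic, 1 = manual).
data(mtcars)
wt.by.am <- tapply(mtcars$wt, mtcars$am, mean)
print(wt.by.am)

# Correlation between weight and the manual indicator.
wt.am.cor <- cor(mtcars$wt, mtcars$am)
print(wt.am.cor)
```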

## Appendix

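Residual analysis and figures: We analyze the influence of each observation on each coefficient of the full model using `dfbetas` and find none exceedingly out of place; the largest value in magnitude is about 1.7 (Ford Pantera L on the `gear` coefficient).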
```
round(dfbetas(fit.all), 3)
##                     (Intercept)    cyl   disp     hp   drat     wt   qsec
## Mazda RX4                -0.080  0.006 -0.097  0.260  0.035  0.109 -0.016
## Mazda RX4 Wag             0.005 -0.032 -0.010  0.141  0.003  0.014 -0.047
## Datsun 710               -0.248  0.138  0.299 -0.305  0.187 -0.386  0.273
## Hornet 4 Drive            0.006  0.001  0.016 -0.011 -0.015 -0.013  0.000
## Hornet Sportabout        -0.015  0.022  0.102 -0.041  0.000 -0.120  0.043
## Valiant                  -0.014 -0.132  0.050  0.017  0.393 -0.031 -0.055
## Duster 360               -0.012  0.013 -0.008 -0.007  0.002  0.014  0.002
## Merc 240D                 0.181 -0.241  0.061 -0.175 -0.087  0.082 -0.184
## Merc 230                  1.262  0.024 -0.262 -0.528 -0.271  0.569 -1.595
## Merc 280                 -0.002  0.071 -0.046 -0.067  0.061  0.060 -0.080
## Merc 280C                 0.110 -0.220  0.077  0.156 -0.162 -0.083  0.068
## Merc 450SE               -0.010  0.223 -0.567  0.243  0.043  0.435 -0.129
## Merc 450SL               -0.057  0.136 -0.210  0.117 -0.003  0.070  0.056
## Merc 450SLC               0.040 -0.053  0.068 -0.044 -0.002 -0.018 -0.042
## Cadillac Fleetwood        0.019  0.157 -0.353  0.234  0.101  0.075 -0.103
## Lincoln Continental      -0.001  0.040 -0.036  0.025  0.007 -0.045  0.004
## Chrysler Imperial         0.043 -0.306 -0.195  0.232  0.321  0.775 -0.284
## Fiat 128                 -0.211  0.251 -0.305  0.165  0.077  0.309  0.116
## Honda Civic              -0.040  0.034  0.116 -0.107  0.154 -0.111  0.033
## Toyota Corolla           -0.559  0.351  0.102  0.112  0.198 -0.310  0.668
## Toyota Corona            -0.423  0.594  0.058 -0.443  0.014  0.171  0.080
## Dodge Challenger         -0.086 -0.061 -0.011  0.108  0.149  0.036  0.062
## AMC Javelin               0.083 -0.239  0.030  0.102 -0.033  0.069 -0.047
## Camaro Z28               -0.001  0.001  0.000 -0.001 -0.001  0.000  0.001
## Pontiac Firebird          0.001 -0.002  0.326 -0.195 -0.030 -0.224  0.034
## Fiat X1-9                -0.005 -0.019  0.013 -0.004  0.002 -0.010  0.004
## Porsche 914-2            -0.018  0.043  0.013 -0.009 -0.017 -0.025  0.024
## Lotus Europa              0.510 -0.312  0.260 -0.049 -0.660 -0.256 -0.400
## Ford Pantera L            1.503 -1.325 -0.008 -0.534 -1.263 -0.291 -0.196
## Ferrari Dino              0.001  0.000  0.000  0.000 -0.001  0.000  0.000
## Maserati Bora            -0.118 -0.026  0.044  0.308 -0.221 -0.210  0.246
## Volvo 142E               -0.210  0.187  0.342 -0.300 -0.075 -0.519  0.314
##                         vs     am   gear   carb
## Mazda RX4            0.099 -0.140  0.167 -0.259
## Mazda RX4 Wag        0.091 -0.129  0.087 -0.126
## Datsun 710          -0.191 -0.371  0.103  0.407
## Hornet 4 Drive       0.021  0.003 -0.002  0.008
## Hornet Sportabout   -0.029 -0.041  0.013  0.021
## Valiant             -0.242 -0.165 -0.076  0.057
## Duster 360           0.002  0.008  0.014 -0.010
## Merc 240D           -0.016 -0.371  0.220  0.038
## Merc 230             0.938  0.403 -0.561 -0.171
## Merc 280             0.115 -0.083  0.047  0.011
## Merc 280C           -0.245  0.176 -0.133 -0.064
## Merc 450SE          -0.120 -0.008  0.037 -0.387
## Merc 450SL          -0.110 -0.012 -0.007 -0.108
## Merc 450SLC          0.046 -0.005 -0.002  0.034
## Cadillac Fleetwood  -0.021 -0.171  0.051 -0.244
## Lincoln Continental -0.005 -0.046  0.011 -0.013
## Chrysler Imperial    0.002  0.233 -0.166 -0.321
## Fiat 128             0.115  0.572 -0.068 -0.349
## Honda Civic          0.058  0.031 -0.109  0.121
## Toyota Corolla      -0.047  0.563 -0.160  0.029
## Toyota Corona        0.230  0.370  0.474 -0.070
## Dodge Challenger     0.005  0.037 -0.064  0.026
## AMC Javelin          0.084  0.078 -0.095  0.090
## Camaro Z28           0.000  0.001  0.001  0.000
## Pontiac Firebird    -0.014 -0.053  0.076  0.044
## Fiat X1-9           -0.039 -0.081  0.033  0.026
## Porsche 914-2        0.074  0.048 -0.046  0.044
## Lotus Europa         0.345 -0.250  0.423 -0.033
## Ford Pantera L      -0.457 -0.094 -1.659  1.395
## Ferrari Dino         0.000  0.000  0.000  0.001
## Maserati Bora        0.080  0.183 -0.029  0.262
## Volvo 142E          -0.216 -0.372  0.261  0.306
par(mfrow = c(2, 2))
plot(fit.all)
```
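Rather than scanning the table by eye, the most influential entry can be located programmatically. This is a sketch: `fit.all` is refitted here so the block stands alone, the names `infl`, `idx`, `worst`, and `n.flagged` are assumptions, and the 2/sqrt(n) cutoff is a common rule of thumb for `dfbetas`, not a threshold used in the analysis above.

```r
# Sketch: find the observation/coefficient pair with the largest |dfbeta|.
data(mtcars)
fit.all <- lm(mpg ~ ., data = mtcars)
infl <- dfbetas(fit.all)

idx <- which(abs(infl) == max(abs(infl)), arr.ind = TRUE)
worst <- data.frame(car = rownames(infl)[idx[, "row"]],
                    coefficient = colnames(infl)[idx[, "col"]],
                    dfbeta = infl[idx])
print(worst)

# Count entries beyond the conventional 2 / sqrt(n) rule of thumb.
n.flagged <- sum(abs(infl) > 2 / sqrt(nrow(mtcars)))
print(n.flagged)
```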