Skip to content

Utilizing R created and analyzed datasets to review and validate their statistical similarities

Notifications You must be signed in to change notification settings

whartzler/MechaCar_Statistical_Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 

Repository files navigation

MechaCar_Statistical_Analysis

Overview

We are working with AutosRU's to review their newest prototype MechaCar. They have asked us to review the production data for insights that may help the manufacturing team.

Resources

MechaCarChallenge.R - R Script that reads and analyzes the datasets Suspension_Coil - data set on MechaCar's Suspension data MechaCar_mpg - data set on MechaCar and metrics

Linear Regression to Predict MPG

  • The variables/coefficients that provided a non random amount of variance to the mpg values in the dataset are intercept, vehicle length & ground clearance as they have the smallest Pr(>|t|) values. The smaller the Pr(>|t|) value the less probability that each coefficient contributes a random amount of variance to the linear model.

image

  • The slope of the linear model is not 0. With the significance of our p-value being much smaller than our assumed significant level of 0.05% the slope of the linear model is not zero and we can reject the null hypothesis.

image

  • The linear model does predict the mpg of the MechaCar prototypes effectively. Utilizing the R-Squared value we can identify how well our model predicts the mpg of the MechaCar prototyps. Based on the calculation the R-Squared value = 0.71 which means appoximately 71% of all mpg predictions will be correct when using this model.

image

Summary Statistics on Suspension Coils

  • The design specifications for the MechaCar suspension coils dictate that the variance of the suspension coils must not exceed 100 pounds per square inch. This current current manufacturing data does not meet the design specification only for Lot 3. Lot 1 & Lot 2 meet this criteria. In the below snippet we calculated the variance for each lot to determine the variance or degree of spead from the mean in the data set. Lot 1 & Lot 2 meet this criteria as their variance is lower than 100. Lot 3 fails this criteris as its variance is 170.29 well above the 100 pounds per square inch allowable variance.

image

T-Tests on Suspension Coils

Based on the below screenshots of each t.test for each individual lot and all lots combined each had varying results. When we reviewed all lots combined a p-value of 0.06 was calculated. Based on this we can conclude that there is no significant difference and we can reject our null hypothesis. Lots 1 & 2 also meet this signicance level with Lot 1 having the best p value. Lot 3 does not meet our criteria with a P-value = 0.042 and the two means are significantly different.

All Manufacturing Lots

image

Lot 1

image

Lot 2

image

Lot 3

image

Study Design: MechaCar vs Competition

In a study to compare MechaCar vs Competition testing city and highway fuel efficiency, maintenance cost, size of vehicle and drive train system would be great metrics to test. In this test there would be multiple hyposthsis to test. The primary alternative hypothesis would be MechaCar city and highway fuel efficiency is better than the competitors. Another alternative hypothesis is maintenance costs higher/lower based on the size and drive train of the vehicles.

We would need to use a mix of statistical tests. When we compare fuel efficiencies we would use two, two sample tests. The first would test city fuel efficiency and the second test would analyze highway fuel efficiency. This would allow us to see if the MechaCar has better fuel efficiency in one specific area or both.
For the second hypothesis to test maintenance costs we would use ANOVA tests as we have multiple independant variables, size and drive train of the vehicles.
The data types we would need for the statistical tests would be a mix of continuous, dichotomous and categorical based on the tests we run. Preferably on top of the data for MechaCar at least 1 other manufacturer to compare data to, but if we can test more manufactures we can see how well MechaCar compares.

About

Utilizing R created and analyzed datasets to review and validate their statistical similarities

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages