| # Model Card | |
| ## Model Card Authors | |
| Mathew | |
| ## Model Description | |
| This is a linear regression model trained on the UCI Automobile dataset to predict the 'symboling' insurance risk rating from 17 numeric car features, including price, horsepower, bore, and curb-weight. Symboling is an integer ranging from -3 to +3. | |
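A linear regression returns real numbers, while symboling is an integer between -3 and +3, so raw predictions may need to be mapped back onto that scale before they are compared with actual ratings. The card does not say whether this step was applied; the sketch below is only one possible approach, and the helper name `to_symboling` is hypothetical.

```python
def to_symboling(raw_prediction):
    """Round a continuous prediction to the nearest integer and clip it to [-3, 3].

    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    return max(-3, min(3, round(raw_prediction)))


# Made-up raw predictions for illustration only.
raw_predictions = [1.37, -0.4, 3.8, -4.2]
ratings = [to_symboling(x) for x in raw_predictions]
print(ratings)
```

Rounding discards information, so metrics such as RMSE are usually reported on the raw predictions rather than on the rounded ones.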
| ## Intended Uses & Limitations | |
| This model is for educational purposes only. It is not suitable for production use because the dataset is small (roughly 200 entries), outdated (1980s), and contains many missing values (41 rows, around 20% of the total, lack a normalized-losses entry). Although the missing data was imputed, predictions should not be used for real insurance decisions. | |
| ## Training Data | |
| Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/automobile). Contains ~200 cars with mixed numeric and categorical features. Missing values were imputed using MICE. | |
| ## Evaluation Metrics | |
| - R2: 0.603 | |
| - RMSE: 0.713 | |
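For readers unfamiliar with the two metrics, here is a minimal sketch of how R2 and RMSE are defined, written in plain Python. The sample values are made up for illustration and are not taken from the Automobile dataset; the card does not state which library or data split produced the reported numbers.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up ratings and predictions, for illustration only.
actual = [3, 1, 0, -1, 2]
predicted = [2.4, 1.3, 0.5, -0.2, 1.1]
print(f"R2: {r2_score(actual, predicted):.3f}, RMSE: {rmse(actual, predicted):.3f}")
```

RMSE is expressed in the same units as symboling, so a value of 0.713 means a typical error of a bit under one rating step on the -3 to +3 scale.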
| ## Ethical Considerations | |
| The 'symboling' risk value is determined not only by continuous variables but also by categorical ones, which the model does not account for. While features such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use the brand and type of car (luxury, sport, etc.), along with a variety of other variables, to determine risk. Because the model ignores these variables, its predictions are unreliable. | |
| ## Audit Questions | |
| - What features most strongly influence predictions? | |
| - Are residuals randomly scattered or patterned? | |
| - How reliable are the evaluation metrics? | |
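One simple way to approach the residuals question is to average the residuals (actual minus predicted) within each actual rating. The sketch below uses made-up numbers and a hypothetical helper name, `mean_residual_by_rating`; it is not the procedure used to produce the plots in this card.

```python
from collections import defaultdict


def mean_residual_by_rating(y_true, y_pred):
    """Group residuals (actual - predicted) by actual rating and average them."""
    groups = defaultdict(list)
    for actual_value, predicted_value in zip(y_true, y_pred):
        groups[actual_value].append(actual_value - predicted_value)
    return {rating: sum(vals) / len(vals) for rating, vals in sorted(groups.items())}


# Made-up ratings and predictions, for illustration only.
actual = [-1, -1, 0, 0, 1, 1, 3, 3]
predicted = [-0.2, 0.1, 0.3, 0.5, 0.9, 1.2, 2.1, 2.4]

result = mean_residual_by_rating(actual, predicted)
for rating, mean_res in result.items():
    print(f"rating {rating:+d}: mean residual {mean_res:+.2f}")
```

If the group means drift steadily from negative to positive as the rating increases, as they do in this toy data, the residuals are patterned: the model is pulling extreme ratings toward the middle of the scale.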
| ## Coefficients | |
| | features | coefficients | | |
| |:------------------|---------------:| | |
| | price | -1.73704e-05 | | |
| | highway-mpg | 0.0438076 | | |
| | city-mpg | -0.0610687 | | |
| | peak-rpm | -5.49499e-05 | | |
| | horsepower | 0.00207246 | | |
| | compression-ratio | 0.0187334 | | |
| | stroke | -0.555667 | | |
| | bore | -0.827261 | | |
| | engine-size | 0.013724 | | |
| | num-of-cylinders | -0.498651 | | |
| | curb-weight | -5.04019e-05 | | |
| | height | 0.0239754 | | |
| | width | 0.195005 | | |
| | length | 0.0120506 | | |
| | wheel-base | -0.153431 | | |
| | num-of-doors | -0.428882 | | |
| | normalized-losses | 0.0116676 | | |
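Raw coefficients are tied to each feature's units, so their magnitudes cannot be compared directly. The sketch below shows how a change in one feature shifts the predicted symboling, which needs only the coefficients above (the intercept is not listed in this card and is not required for a difference). It assumes the features were not rescaled before fitting and that num-of-doors is encoded as the number of doors; neither is confirmed by the card, and the helper `prediction_change` is hypothetical.

```python
# Coefficients copied from the table above.
COEFFICIENTS = {
    "price": -1.73704e-05,
    "highway-mpg": 0.0438076,
    "city-mpg": -0.0610687,
    "peak-rpm": -5.49499e-05,
    "horsepower": 0.00207246,
    "compression-ratio": 0.0187334,
    "stroke": -0.555667,
    "bore": -0.827261,
    "engine-size": 0.013724,
    "num-of-cylinders": -0.498651,
    "curb-weight": -5.04019e-05,
    "height": 0.0239754,
    "width": 0.195005,
    "length": 0.0120506,
    "wheel-base": -0.153431,
    "num-of-doors": -0.428882,
    "normalized-losses": 0.0116676,
}


def prediction_change(feature, delta):
    """Change in predicted symboling when one feature moves by `delta`,
    with all other features held fixed (assumes unscaled inputs)."""
    return COEFFICIENTS[feature] * delta


# Illustrative changes; the deltas are arbitrary examples.
print(prediction_change("num-of-doors", 2))   # two doors -> four doors
print(prediction_change("price", 10_000))     # price raised by 10,000
print(prediction_change("bore", 0.5))         # bore raised by 0.5
```

The price coefficient looks negligible only because price is measured in single currency units; over a realistic price range its effect is comparable to that of other features.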
| ## Plots | |
| ### Predicted vs Actual | |
|  | |
| ### Residuals Plot | |
|  | |