CSC310-fall25/boullier_assignment5_regressionmodel

mboullier committed on Nov 7, 2025

Commit c4485cf (verified) · 1 parent: 85a4d68

Upload folder using huggingface_hub

README.md +2 -2


@@ -8,7 +8,7 @@ Mathew
 This is a Linear Regression model trained on the UCI Automobile dataset to predict the 'symboling' insurance risk rating from 17 car features including price, horsepower, bore, and curb-weight, amongst other continous variables. Symboling is defined as an integer value (whole number), ranging from -3 to +3.
 ## Intended Uses & Limitations
-This model is for educational purposes only. It is not suitable for production use because the dataset is small (only 200 or so entries), outdated (1980s), and contained a lot of missing values (41 missing normalized-losses, around 20% of all rows had a missing normalized-losses entry). Predictions should not be used for real insurance predictions.
+This model is for educational purposes only. It is not suitable for production use because the dataset is small (only 200 or so entries), outdated (1980s), and contained a lot of missing values (41 missing normalized-losses, around 20% of all rows had a missing normalized-losses entry). While the missing data was imputated, predictions should not be used for real insurance predictions.
 ## Training Data
 Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/automobile). Contains ~200 cars with mixed numeric and categorical features. Missing values were imputed using MICE.
@@ -18,7 +18,7 @@ Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/auto
 - RMSE: 0.713
 ## Ethical Considerations
-The 'symboling' risk value is not only determined by continous, but categorical variables as well, which the model does not account for. While things such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use brands of cars and the type of car (luxury, sport, etc), as well as a variety of other variables to help determine risk factors.
+The 'symboling' risk value is not only determined by continous, but categorical variables as well, which the model does not account for. While things such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use brands of cars and the type of car (luxury, sport, etc), as well as a variety of other variables to help determine risk factors. Because the model does not take these variables into account, it is very unreliable.
 ## Audit Questions
 - What features most strongly influence predictions?