Upload folder using huggingface_hub
- .ipynb_checkpoints/README-checkpoint.md +55 -0
- README.md +55 -0
- auto_testing.csv +42 -0
- config.json +117 -0
- model.pkl +3 -0
- predicted_vs_actual.png +0 -0
- residuals_plot.png +0 -0
.ipynb_checkpoints/README-checkpoint.md
ADDED
@@ -0,0 +1,55 @@
+
+# Model Card
+
+## Model Card Authors
+Mathew
+
+## Model Description
+This is a linear regression model trained on the UCI Automobile dataset to predict the 'symboling' insurance risk rating from 17 car features, including price, horsepower, bore, and curb-weight, among other continuous variables.
+
+## Intended Uses & Limitations
+This model is for educational purposes only. It is not suitable for production use because the dataset is small (~200 entries), outdated (~1980s), and contains many missing values (41 rows, around 20% of the total, have no normalized-losses entry). Its predictions should not be used for real insurance decisions.
+
+## Training Data
+Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/automobile). It contains ~200 cars with mixed numeric and categorical features. Missing values were imputed using MICE.
+
+## Evaluation Metrics
+- R2: 0.603
+- RMSE: 0.713
+
+## Ethical Considerations
+The 'symboling' risk value is determined not only by continuous variables but also by categorical ones, which the model does not account for. While features such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use the brand and type of car (luxury, sport, etc.), as well as a variety of other variables, to help determine risk.
+
+## Audit Questions
+- What features most strongly influence predictions?
+- Are residuals randomly scattered or patterned?
+- How reliable are the evaluation metrics?
+
+
+## Coefficients
+| features          |   coefficients |
+|:------------------|---------------:|
+| price             |   -1.73704e-05 |
+| highway-mpg       |    0.0438076   |
+| city-mpg          |   -0.0610687   |
+| peak-rpm          |   -5.49499e-05 |
+| horsepower        |    0.00207246  |
+| compression-ratio |    0.0187334   |
+| stroke            |   -0.555667    |
+| bore              |   -0.827261    |
+| engine-size       |    0.013724    |
+| num-of-cylinders  |   -0.498651    |
+| curb-weight       |   -5.04019e-05 |
+| height            |    0.0239754   |
+| width             |    0.195005    |
+| length            |    0.0120506   |
+| wheel-base        |   -0.153431    |
+| num-of-doors      |   -0.428882    |
+| normalized-losses |    0.0116676   |
+
+## Plots
+### Predicted vs Actual
+
+
+### Residuals Plot
+
README.md
ADDED
@@ -0,0 +1,55 @@
+
+# Model Card
+
+## Model Card Authors
+Mathew
+
+## Model Description
+This is a linear regression model trained on the UCI Automobile dataset to predict the 'symboling' insurance risk rating from 17 car features, including price, horsepower, bore, and curb-weight, among other continuous variables.
+
+## Intended Uses & Limitations
+This model is for educational purposes only. It is not suitable for production use because the dataset is small (~200 entries), outdated (~1980s), and contains many missing values (41 rows, around 20% of the total, have no normalized-losses entry). Its predictions should not be used for real insurance decisions.
+
+## Training Data
+Data source: UCI Automobile dataset (https://archive.ics.uci.edu/dataset/10/automobile). It contains ~200 cars with mixed numeric and categorical features. Missing values were imputed using MICE.
+
+## Evaluation Metrics
+- R2: 0.603
+- RMSE: 0.713
+
+## Ethical Considerations
+The 'symboling' risk value is determined not only by continuous variables but also by categorical ones, which the model does not account for. While features such as horsepower, bore, engine-size, and number of doors are good predictors, insurance companies also use the brand and type of car (luxury, sport, etc.), as well as a variety of other variables, to help determine risk.
+
+## Audit Questions
+- What features most strongly influence predictions?
+- Are residuals randomly scattered or patterned?
+- How reliable are the evaluation metrics?
+
+
+## Coefficients
+| features          |   coefficients |
+|:------------------|---------------:|
+| price             |   -1.73704e-05 |
+| highway-mpg       |    0.0438076   |
+| city-mpg          |   -0.0610687   |
+| peak-rpm          |   -5.49499e-05 |
+| horsepower        |    0.00207246  |
+| compression-ratio |    0.0187334   |
+| stroke            |   -0.555667    |
+| bore              |   -0.827261    |
+| engine-size       |    0.013724    |
+| num-of-cylinders  |   -0.498651    |
+| curb-weight       |   -5.04019e-05 |
+| height            |    0.0239754   |
+| width             |    0.195005    |
+| length            |    0.0120506   |
+| wheel-base        |   -0.153431    |
+| num-of-doors      |   -0.428882    |
+| normalized-losses |    0.0116676   |
+
+## Plots
+### Predicted vs Actual
+
+
+### Residuals Plot
+
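One of the audit questions in the README asks which features most strongly influence predictions. Because the coefficients are on the raw scale of each feature, their magnitudes are not directly comparable; a rough way to compare them is to look at each term `coefficient * value` for a concrete car. The sketch below does this for the first data row of `auto_testing.csv` (index 111). The function name `contributions` is hypothetical, and the README does not publish the intercept, so the terms do not sum to a prediction.

```python
# Coefficients copied from the README "Coefficients" table.
COEFFICIENTS = {
    "price": -1.73704e-05,
    "highway-mpg": 0.0438076,
    "city-mpg": -0.0610687,
    "peak-rpm": -5.49499e-05,
    "horsepower": 0.00207246,
    "compression-ratio": 0.0187334,
    "stroke": -0.555667,
    "bore": -0.827261,
    "engine-size": 0.013724,
    "num-of-cylinders": -0.498651,
    "curb-weight": -5.04019e-05,
    "height": 0.0239754,
    "width": 0.195005,
    "length": 0.0120506,
    "wheel-base": -0.153431,
    "num-of-doors": -0.428882,
    "normalized-losses": 0.0116676,
}

# First data row of auto_testing.csv (index 111), used only as an illustration.
EXAMPLE_ROW = {
    "price": 15580.0, "highway-mpg": 24, "city-mpg": 19, "peak-rpm": 5000.0,
    "horsepower": 95.0, "compression-ratio": 8.4, "stroke": 2.19, "bore": 3.46,
    "engine-size": 120, "num-of-cylinders": 4, "curb-weight": 3075,
    "height": 56.7, "width": 68.4, "length": 186.7, "wheel-base": 107.9,
    "num-of-doors": 4.0, "normalized-losses": 161.0,
}


def contributions(row, coefficients):
    """Return (feature, coefficient * value) pairs, largest magnitude first.

    Hypothetical helper: the intercept is not in the README, so these terms
    are only the feature-dependent part of the linear model.
    """
    terms = {name: coefficients[name] * row[name] for name in coefficients}
    return sorted(terms.items(), key=lambda item: abs(item[1]), reverse=True)


for name, term in contributions(EXAMPLE_ROW, COEFFICIENTS):
    print(f"{name:>18}: {term:+.3f}")
```

For this row the wheel-base and width terms dominate, even though bore and stroke have the largest raw coefficients. A per-row view like this is only a rough guide; standardizing the features before fitting would give coefficients that can be compared directly.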
auto_testing.csv
ADDED
@@ -0,0 +1,42 @@
+,price,highway-mpg,city-mpg,peak-rpm,horsepower,compression-ratio,stroke,bore,engine-size,num-of-cylinders,curb-weight,height,width,length,wheel-base,num-of-doors,normalized-losses,symboling
+111,15580.0,24,19,5000.0,95.0,8.4,2.19,3.46,120,4,3075,56.7,68.4,186.7,107.9,4.0,161.0,0
+17,36880.0,20,15,5400.0,182.0,8.0,3.39,3.62,209,6,3505,56.3,70.9,197.0,110.0,4.0,128.0,0
+116,17950.0,33,28,4150.0,95.0,21.0,3.52,3.7,152,4,3252,56.7,68.4,186.7,107.9,4.0,161.0,0
+6,17710.0,25,19,5500.0,110.0,8.5,3.4,3.19,136,5,2844,55.7,71.4,192.7,105.8,4.0,158.0,1
+9,17710.0,22,16,5500.0,160.0,7.0,3.4,3.13,131,5,3053,52.0,67.9,178.2,99.5,2.0,197.0,0
+141,7126.0,37,32,4800.0,82.0,9.5,2.64,3.62,108,4,2145,52.5,65.4,172.0,97.2,4.0,102.0,0
+197,16515.0,28,24,5400.0,114.0,9.5,3.15,3.78,141,4,3042,57.5,67.2,188.8,104.3,4.0,74.0,-1
+185,8195.0,34,27,5250.0,85.0,9.0,3.4,3.19,109,4,2212,55.7,65.5,171.7,97.3,4.0,94.0,2
+196,15985.0,28,24,5400.0,114.0,9.5,3.15,3.78,141,4,2935,56.2,67.2,188.8,104.3,4.0,103.0,-2
+127,34028.0,25,17,5900.0,207.0,9.5,2.9,3.74,194,6,2756,51.6,65.0,168.9,89.5,2.0,145.0,3
+183,7975.0,34,27,5250.0,85.0,9.0,3.4,3.19,109,4,2209,55.7,65.5,171.7,97.3,2.0,122.0,2
+62,10245.0,32,26,4800.0,84.0,8.6,3.39,3.39,122,4,2410,55.5,66.5,177.8,98.8,4.0,115.0,0
+187,9495.0,42,37,4500.0,68.0,23.0,3.4,3.01,97,4,2319,55.7,65.5,171.7,97.3,4.0,94.0,2
+20,6575.0,43,38,5400.0,70.0,9.6,3.11,3.03,90,4,1909,52.0,63.6,158.8,94.5,4.0,81.0,0
+135,15510.0,28,21,5250.0,110.0,9.3,3.07,3.54,121,4,2758,56.1,66.5,186.6,99.1,4.0,104.0,2
+162,9258.0,34,28,4800.0,70.0,9.0,3.03,3.19,98,4,2140,52.8,64.4,166.3,95.7,4.0,91.0,0
+45,6338.0,43,38,5400.0,70.0,9.6,3.11,3.03,90,4,1909,52.0,63.6,155.9,94.5,4.0,83.0,0
+83,14869.0,24,19,5000.0,145.0,7.0,3.86,3.59,156,4,2921,50.2,66.3,173.2,95.9,2.0,164.0,3
+129,35056.0,28,17,5750.0,288.0,10.0,3.11,3.94,203,8,3366,50.5,72.3,175.7,98.4,2.0,188.0,1
+2,16500.0,26,19,5000.0,154.0,9.0,3.47,2.68,152,6,2823,52.4,65.5,171.2,94.5,2.0,145.0,1
+40,10295.0,33,27,5800.0,86.0,9.0,3.58,3.15,110,4,2372,54.1,62.5,175.4,96.5,4.0,85.0,0
+52,6795.0,38,31,5000.0,68.0,9.0,3.15,3.03,91,4,1905,54.1,64.2,159.1,93.1,2.0,104.0,1
+75,16503.0,24,19,5000.0,175.0,8.0,3.12,3.78,140,4,2910,54.8,68.0,178.4,102.7,2.0,158.0,1
+13,21105.0,28,21,4250.0,121.0,9.0,3.19,3.31,164,6,2765,54.3,64.8,176.8,101.2,4.0,188.0,0
+171,11549.0,30,24,4800.0,116.0,9.3,3.5,3.62,146,4,2714,52.0,65.6,176.2,98.4,2.0,134.0,2
+21,5572.0,41,37,5500.0,68.0,9.41,3.23,2.97,90,4,1876,50.8,63.8,157.3,93.7,2.0,118.0,1
+54,7395.0,38,31,5000.0,68.0,9.0,3.15,3.08,91,4,1950,54.1,64.2,166.8,93.1,4.0,113.0,1
+42,10345.0,31,25,5500.0,100.0,9.1,3.58,3.15,110,4,2293,51.0,66.0,169.1,96.5,2.0,107.0,1
+194,12940.0,28,23,5400.0,114.0,9.5,3.15,3.78,141,4,2912,56.2,67.2,188.8,104.3,4.0,103.0,-2
+202,21485.0,23,18,5500.0,134.0,8.8,2.87,3.58,173,6,3012,55.5,68.9,188.8,109.1,4.0,95.0,-1
+156,6938.0,37,30,4800.0,70.0,9.0,3.03,3.19,98,4,2081,53.0,64.4,166.3,95.7,4.0,91.0,0
+198,18420.0,22,17,5100.0,162.0,7.5,3.15,3.62,130,4,3045,56.2,67.2,188.8,104.3,4.0,103.0,-2
+150,5348.0,39,35,4800.0,62.0,9.0,3.03,3.05,92,4,1985,54.5,63.6,158.7,95.7,2.0,87.0,1
+147,10198.0,31,25,5200.0,94.0,9.0,2.64,3.62,108,4,2455,53.0,65.4,173.5,97.0,4.0,89.0,0
+19,6295.0,43,38,5400.0,70.0,9.6,3.11,3.03,90,4,1874,52.0,63.6,155.9,94.5,2.0,98.0,1
+108,13200.0,33,28,4150.0,95.0,21.0,3.52,3.7,152,4,3197,56.7,68.4,186.7,107.9,4.0,161.0,0
+168,9639.0,30,24,4800.0,116.0,9.3,3.5,3.62,146,4,2536,52.0,65.6,176.2,98.4,2.0,134.0,2
+22,6377.0,38,31,5500.0,68.0,9.4,3.23,2.97,90,4,1876,50.8,63.8,157.3,93.7,2.0,118.0,1
+140,7603.0,31,26,4400.0,73.0,8.7,2.64,3.62,108,4,2240,55.7,63.8,157.3,93.3,2.0,83.0,2
+199,18950.0,22,17,5100.0,162.0,7.5,3.15,3.62,130,4,3157,57.5,67.2,188.8,104.3,4.0,74.0,-1
+155,8778.0,32,27,4800.0,62.0,9.0,3.03,3.05,92,4,3110,59.1,63.6,169.7,95.7,4.0,91.0,0
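The README reports R2 and RMSE without saying how they were computed. As a reminder of the standard definitions, here is a minimal pure-Python sketch. The function names `rmse` and `r2` are hypothetical, the `y_true` values are the `symboling` column of the first eight rows above, and the `y_pred` values are made up for illustration; whether the README metrics were computed on this file is an assumption the commit does not confirm.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# 'symboling' for the first eight rows of auto_testing.csv.
y_true = [0, 0, 0, 1, 0, 0, -1, 2]
# Made-up predictions, NOT output of model.pkl.
y_pred = [0.2, -0.1, 0.4, 0.8, 0.5, 0.1, -0.6, 1.3]

print(f"RMSE: {rmse(y_true, y_pred):.3f}")
print(f"R2:   {r2(y_true, y_pred):.3f}")
```

Since `symboling` is an integer rating from -3 to 3, an RMSE near 0.7 means a typical error of a bit under one rating step. With only 41 test rows, both metrics will vary noticeably from one train/test split to another.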
config.json
ADDED
@@ -0,0 +1,117 @@
+{
+    "sklearn": {
+        "columns": [
+            "price",
+            "highway-mpg",
+            "city-mpg",
+            "peak-rpm",
+            "horsepower",
+            "compression-ratio",
+            "stroke",
+            "bore",
+            "engine-size",
+            "num-of-cylinders",
+            "curb-weight",
+            "height",
+            "width",
+            "length",
+            "wheel-base",
+            "num-of-doors",
+            "normalized-losses"
+        ],
+        "environment": [
+            "scikit-learn=1.0.2"
+        ],
+        "example_input": {
+            "price": [
+                13495,
+                16500,
+                13950
+            ],
+            "highway-mpg": [
+                29,
+                28,
+                31
+            ],
+            "city-mpg": [
+                21,
+                19,
+                24
+            ],
+            "peak-rpm": [
+                5000,
+                5500,
+                4800
+            ],
+            "horsepower": [
+                102,
+                115,
+                110
+            ],
+            "compression-ratio": [
+                9.0,
+                9.0,
+                9.0
+            ],
+            "stroke": [
+                3.4,
+                3.4,
+                3.2
+            ],
+            "bore": [
+                3.47,
+                3.01,
+                3.19
+            ],
+            "engine-size": [
+                109,
+                136,
+                120
+            ],
+            "num-of-cylinders": [
+                4,
+                4,
+                4
+            ],
+            "curb-weight": [
+                2548,
+                2823,
+                2507
+            ],
+            "height": [
+                54.3,
+                55.1,
+                54.5
+            ],
+            "width": [
+                64.1,
+                65.5,
+                66.2
+            ],
+            "length": [
+                168.8,
+                171.2,
+                176.6
+            ],
+            "wheel-base": [
+                94.5,
+                94.5,
+                96.5
+            ],
+            "num-of-doors": [
+                4.0,
+                2.0,
+                4.0
+            ],
+            "normalized-losses": [
+                65,
+                103,
+                74
+            ]
+        },
+        "model": {
+            "file": "model.pkl"
+        },
+        "task": "tabular-regression"
+    }
+}
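The `example_input` block in `config.json` is column-oriented (one list per feature), while a fitted scikit-learn estimator expects rows in the order given by `columns`. The sketch below shows one way to turn the former into the latter using only the standard library. `CONFIG_EXCERPT` is a two-column excerpt shortened for illustration, and `example_rows` is a hypothetical helper, not part of any library.

```python
import json

# Two-column excerpt of config.json, shortened for illustration only.
CONFIG_EXCERPT = """
{
    "sklearn": {
        "columns": ["price", "highway-mpg"],
        "example_input": {
            "price": [13495, 16500, 13950],
            "highway-mpg": [29, 28, 31]
        },
        "model": {"file": "model.pkl"},
        "task": "tabular-regression"
    }
}
"""


def example_rows(config):
    """Convert column-oriented example_input into rows ordered by 'columns'."""
    section = config["sklearn"]
    columns = section["columns"]
    example = section["example_input"]
    missing = [name for name in columns if name not in example]
    if missing:
        raise KeyError(f"example_input lacks columns: {missing}")
    n_rows = len(example[columns[0]])
    return [[example[name][i] for name in columns] for i in range(n_rows)]


rows = example_rows(json.loads(CONFIG_EXCERPT))
print(rows)  # [[13495, 29], [16500, 28], [13950, 31]]
```

With the full config, each row would have 17 values and could be passed to the `predict` method of the estimator stored in `model.pkl`. Unpickling executes arbitrary code, so the file should only be loaded from a trusted source, ideally in an environment matching the pinned `scikit-learn=1.0.2`.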
model.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:397d94c940c067e31b74c2026c14e70ebed350169a6a9ab44405a86b48a254d6
+size 10632
predicted_vs_actual.png
ADDED
residuals_plot.png
ADDED