Upload folder using huggingface_hub
- .gitattributes +1 -0
- .ipynb_checkpoints/README-checkpoint.md +35 -0
- README.md +35 -0
- config.json +93 -0
- model.pkl +3 -0
- wine_clusters.png +3 -0
- wine_testing.csv +37 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+wine_clusters.png filter=lfs diff=lfs merge=lfs -text
.ipynb_checkpoints/README-checkpoint.md
ADDED
@@ -0,0 +1,35 @@
+
+# Model Card
+
+## Model Card Authors
+Mathew
+
+## Model Description
+This is a KMeans clustering model trained on the UCI Wine dataset. The model groups wines into clusters based on 13 chemical analysis features such as alcohol, flavanoids, color intensity, and proline. The dataset has three ground truth classes (wine cultivars; simply called ``classes`` in the dataset), which were used to evaluate clustering performance but not during training. The number of clusters was set to K=3 to match the three classes.
+
+## Intended Uses & Limitations
+This clustering model is for educational purposes only. It is not suitable for production use because the dataset is small (178 samples) and well-structured, which makes clustering easier than on more complex, real-world datasets. Results should not be generalized beyond this dataset.
+
+## Training Data
+Data source: UCI Wine dataset (https://archive.ics.uci.edu/dataset/109/wine). The dataset contains 178 wines described by 13 continuous chemical features. Ground truth labels (three ``classes``) were used only for evaluation.
+
+## Evaluation Metrics
+- Adjusted Rand Index (ARI): 0.849
+- Normalized Mutual Information (NMI): 0.82
+
+## Ethical Considerations
+Clustering models can reveal structure in data but should not be used for decision-making without careful validation. In domains like healthcare or finance, misinterpreting clusters as ground truth categories could lead to harmful conclusions. Here, the Wine dataset is safe for educational use, but the same methods applied to sensitive data would require rigorous ethical review.
+
+## Audit Questions
+- How stable are the clusters across different random seeds or initialization methods?
+- Do the clusters correspond meaningfully to the known wine cultivars?
+- How do ARI and NMI compare to supervised classification accuracy?
+- Are there features that dominate clustering outcomes (e.g., alcohol, flavanoids)?
+
+
+## Plots
+### Clusters vs Ground Truth (e.g., PCA projection)
+
+
+### Silhouette Plot
+
README.md
ADDED
@@ -0,0 +1,35 @@
+
+# Model Card
+
+## Model Card Authors
+Mathew
+
+## Model Description
+This is a KMeans clustering model trained on the UCI Wine dataset. The model groups wines into clusters based on 13 chemical analysis features such as alcohol, flavanoids, color intensity, and proline. The dataset has three ground truth classes (wine cultivars; simply called ``classes`` in the dataset), which were used to evaluate clustering performance but not during training. The number of clusters was set to K=3 to match the three classes.
+
+## Intended Uses & Limitations
+This clustering model is for educational purposes only. It is not suitable for production use because the dataset is small (178 samples) and well-structured, which makes clustering easier than on more complex, real-world datasets. Results should not be generalized beyond this dataset.
+
+## Training Data
+Data source: UCI Wine dataset (https://archive.ics.uci.edu/dataset/109/wine). The dataset contains 178 wines described by 13 continuous chemical features. Ground truth labels (three ``classes``) were used only for evaluation.
+
+## Evaluation Metrics
+- Adjusted Rand Index (ARI): 0.849
+- Normalized Mutual Information (NMI): 0.82
+
+## Ethical Considerations
+Clustering models can reveal structure in data but should not be used for decision-making without careful validation. In domains like healthcare or finance, misinterpreting clusters as ground truth categories could lead to harmful conclusions. Here, the Wine dataset is safe for educational use, but the same methods applied to sensitive data would require rigorous ethical review.
+
+## Audit Questions
+- How stable are the clusters across different random seeds or initialization methods?
+- Do the clusters correspond meaningfully to the known wine cultivars?
+- How do ARI and NMI compare to supervised classification accuracy?
+- Are there features that dominate clustering outcomes (e.g., alcohol, flavanoids)?
+
+
+## Plots
+### Clusters vs Ground Truth (e.g., PCA projection)
+
+
+### Silhouette Plot
+
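The model card describes KMeans with K=3, evaluated against the cultivar labels with ARI and NMI. A minimal sketch of that workflow might look like the following. It is not the author's training script: it assumes scikit-learn's bundled copy of the Wine data (`load_wine`), and the standardization step, `n_init`, and `random_state` are assumptions that the card does not state, so the printed scores will not necessarily match the reported 0.849 and 0.82.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score
from sklearn.preprocessing import StandardScaler

# Bundled copy of the UCI Wine data: 178 samples, 13 features, 3 cultivars.
X, y_true = load_wine(return_X_y=True)

# Assumption: features are standardized before clustering (not stated in the card).
X_scaled = StandardScaler().fit_transform(X)

# K=3 as in the model card; n_init and random_state are illustrative choices.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Ground truth labels are used only here, for evaluation, never during fitting.
ari = adjusted_rand_score(y_true, labels)
nmi = normalized_mutual_info_score(y_true, labels)
print(f"ARI={ari:.3f}  NMI={nmi:.3f}")
```

Rerunning the sketch with several `random_state` values is one simple way to probe the first audit question about stability across seeds.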
config.json
ADDED
@@ -0,0 +1,93 @@
+{
+    "sklearn": {
+        "columns": [
+            "Alcohol",
+            "Malic acid",
+            "Ash",
+            "Alcalinity of ash",
+            "Magnesium",
+            "Total phenols",
+            "Flavanoids",
+            "Nonflavanoid phenols",
+            "Proanthocyanins",
+            "Color intensity",
+            "Hue",
+            "OD280/OD315 of diluted wines",
+            "Proline"
+        ],
+        "environment": [
+            "scikit-learn=1.0.2"
+        ],
+        "example_input": {
+            "Alcohol": [
+                14.23,
+                13.2,
+                13.16
+            ],
+            "Malic acid": [
+                1.71,
+                1.78,
+                2.36
+            ],
+            "Ash": [
+                2.43,
+                2.14,
+                2.67
+            ],
+            "Alcalinity of ash": [
+                15.6,
+                11.2,
+                18.6
+            ],
+            "Magnesium": [
+                127,
+                100,
+                101
+            ],
+            "Total phenols": [
+                2.8,
+                2.65,
+                2.8
+            ],
+            "Flavanoids": [
+                3.06,
+                2.76,
+                3.24
+            ],
+            "Nonflavanoid phenols": [
+                0.28,
+                0.26,
+                0.3
+            ],
+            "Proanthocyanins": [
+                2.29,
+                1.28,
+                2.81
+            ],
+            "Color intensity": [
+                5.64,
+                4.38,
+                5.68
+            ],
+            "Hue": [
+                1.04,
+                1.05,
+                1.03
+            ],
+            "OD280/OD315 of diluted wines": [
+                3.92,
+                3.4,
+                3.17
+            ],
+            "Proline": [
+                1065,
+                1050,
+                1185
+            ]
+        },
+        "model": {
+            "file": "model.pkl"
+        },
+        "task": "tabular-classification"
+    }
+}
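The `example_input` block in config.json is column-oriented: each feature name maps to a list of values, one per sample. A rough sketch of turning it into per-sample rows in the order given by `columns` is shown below. The embedded config is trimmed to two of the 13 columns for brevity, and the variable names are illustrative only; nothing here loads or calls `model.pkl`.

```python
import json

# Trimmed copy of the config above: two of the 13 columns, for brevity.
config_text = """
{
    "sklearn": {
        "columns": ["Alcohol", "Malic acid"],
        "example_input": {
            "Alcohol": [14.23, 13.2, 13.16],
            "Malic acid": [1.71, 1.78, 2.36]
        },
        "model": {"file": "model.pkl"},
        "task": "tabular-classification"
    }
}
"""

cfg = json.loads(config_text)["sklearn"]
columns = cfg["columns"]
example = cfg["example_input"]

# example_input is column-oriented; transpose it into one list per sample,
# ordered by "columns" so the layout follows the declared feature order.
rows = [list(sample) for sample in zip(*(example[name] for name in columns))]

for row in rows:
    print(dict(zip(columns, row)))
```

Ordering by `columns` instead of relying on dictionary order keeps the row layout explicit, which matters if the rows are later passed to an estimator as a plain array.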
model.pkl
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d0a4a10b7eafe961bc278152391e0bef067307b37489acfc492d3eb033d8ea8
+size 9688
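The three lines added for model.pkl are a Git LFS pointer, not the pickle itself: they record the spec version, the SHA-256 of the real file, and its size in bytes. As an illustration, a pointer like this could be parsed as follows; `parse_lfs_pointer` is a hypothetical helper written for this sketch, not part of any Git LFS tooling.

```python
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2d0a4a10b7eafe961bc278152391e0bef067307b37489acfc492d3eb033d8ea8
size 9688
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size_bytes"])
```

Comparing `size_bytes` and the digest against a downloaded model.pkl is a cheap integrity check before unpickling it.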
wine_clusters.png
ADDED
wine_testing.csv
ADDED
@@ -0,0 +1,37 @@
+,Alcohol,Malicacid,Ash,Alcalinity_of_ash,Magnesium,Total_phenols,Flavanoids,Nonflavanoid_phenols,Proanthocyanins,Color_intensity,Hue,0D280_0D315_of_diluted_wines,Proline,class
+13,14.75,1.73,2.39,11.4,91,3.1,3.69,0.43,2.81,5.4,1.25,2.73,1150,1
+113,11.41,0.74,2.5,21.0,88,2.48,2.01,0.42,1.44,3.08,1.1,2.31,434,2
+21,12.93,3.8,2.65,18.6,102,2.41,2.41,0.25,1.98,4.5,1.03,3.52,770,1
+143,13.62,4.95,2.35,20.0,92,2.0,0.8,0.47,1.02,4.4,0.91,2.05,550,3
+173,13.71,5.65,2.45,20.5,95,1.68,0.61,0.52,1.06,7.7,0.64,1.74,740,3
+9,13.86,1.35,2.27,16.0,98,2.98,3.15,0.22,1.85,7.22,1.01,3.55,1045,1
+52,13.82,1.75,2.42,14.0,111,3.88,3.74,0.32,1.87,7.05,1.01,3.26,1190,1
+17,13.83,1.57,2.62,20.0,115,2.95,3.4,0.4,1.72,6.6,1.13,2.57,1130,1
+131,12.88,2.99,2.4,20.0,104,1.3,1.22,0.24,0.83,5.4,0.74,1.42,530,3
+2,13.16,2.36,2.67,18.6,101,2.8,3.24,0.3,2.81,5.68,1.03,3.17,1185,1
+125,12.07,2.16,2.17,21.0,85,2.6,2.65,0.37,1.35,2.76,0.86,3.28,378,2
+85,12.67,0.98,2.24,18.0,99,2.2,1.94,0.3,1.46,2.62,1.23,3.16,450,2
+150,13.5,3.12,2.62,24.0,123,1.4,1.57,0.22,1.25,8.6,0.59,1.3,500,3
+20,14.06,1.63,2.28,16.0,126,3.0,3.17,0.24,2.1,5.65,1.09,3.71,780,1
+54,13.74,1.67,2.25,16.4,118,2.6,2.9,0.21,1.62,5.85,0.92,3.2,1060,1
+164,13.78,2.76,2.3,22.0,90,1.35,0.68,0.41,1.03,9.58,0.7,1.68,615,3
+144,12.25,3.88,2.2,18.5,112,1.38,0.78,0.29,1.14,8.21,0.65,2.0,855,3
+19,13.64,3.1,2.56,15.2,116,2.7,3.03,0.17,1.66,5.1,0.96,3.36,845,1
+170,12.2,3.03,2.32,19.0,96,1.25,0.49,0.4,0.73,5.5,0.66,1.83,510,3
+45,14.21,4.04,2.44,18.9,111,2.85,2.65,0.3,1.25,5.24,0.87,3.33,1080,1
+42,13.88,1.89,2.59,15.0,101,3.25,3.56,0.17,1.7,5.43,0.88,3.56,1095,1
+154,12.58,1.29,2.1,20.0,103,1.48,0.58,0.53,1.4,7.6,0.58,1.55,640,3
+157,12.45,3.03,2.64,27.0,97,1.9,0.58,0.63,1.14,7.5,0.67,1.73,880,3
+114,12.08,1.39,2.5,22.5,84,2.56,2.29,0.43,1.04,2.9,0.93,3.19,385,2
+75,11.66,1.88,1.92,16.0,97,1.61,1.57,0.34,1.15,3.8,1.23,2.14,428,2
+101,12.6,1.34,1.9,18.5,88,1.45,1.36,0.29,1.35,2.45,1.04,2.77,562,2
+6,14.39,1.87,2.45,14.6,96,2.5,2.52,0.3,1.98,5.25,1.02,3.58,1290,1
+22,13.71,1.86,2.36,16.6,101,2.61,2.88,0.27,1.69,3.8,1.11,4.0,1035,1
+60,12.33,1.1,2.28,16.0,101,2.05,1.09,0.63,0.41,3.27,1.25,1.67,680,2
+40,13.56,1.71,2.31,16.2,117,3.15,3.29,0.34,2.34,6.13,0.95,3.38,795,1
+62,13.67,1.25,1.92,18.0,94,2.1,1.79,0.32,0.73,3.8,1.23,2.46,630,2
+68,13.34,0.94,2.36,17.0,110,2.53,1.3,0.55,0.42,3.17,1.02,1.93,750,2
+149,13.08,3.9,2.36,21.5,113,1.41,1.39,0.34,1.14,9.4,0.57,1.33,550,3
+106,12.25,1.73,2.12,19.0,80,1.65,2.03,0.37,1.63,3.4,1.0,3.17,510,2
+26,13.39,1.77,2.62,16.1,93,2.85,2.94,0.34,1.45,4.8,0.92,3.22,1195,1
+162,12.85,3.27,2.58,22.0,106,1.65,0.6,0.6,0.96,5.58,0.87,2.11,570,3
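The header of wine_testing.csv uses different feature names from the `columns` list in config.json (for example `Malicacid` versus `Malic acid`), and it adds an unnamed leading column plus a trailing `class` column. The sketch below shows one way to read the rows and relabel the features positionally with the config.json names. It assumes the two files list the 13 features in the same order, which the names suggest but the commit does not state, and that the unnamed first column is a row index from the original data.

```python
import csv
import io

# Header and first two data rows of wine_testing.csv, copied from the diff above.
csv_text = """\
,Alcohol,Malicacid,Ash,Alcalinity_of_ash,Magnesium,Total_phenols,Flavanoids,Nonflavanoid_phenols,Proanthocyanins,Color_intensity,Hue,0D280_0D315_of_diluted_wines,Proline,class
13,14.75,1.73,2.39,11.4,91,3.1,3.69,0.43,2.81,5.4,1.25,2.73,1150,1
113,11.41,0.74,2.5,21.0,88,2.48,2.01,0.42,1.44,3.08,1.1,2.31,434,2
"""

# Feature names in config.json order (assumed to match the CSV column order).
config_columns = [
    "Alcohol", "Malic acid", "Ash", "Alcalinity of ash", "Magnesium",
    "Total phenols", "Flavanoids", "Nonflavanoid phenols", "Proanthocyanins",
    "Color intensity", "Hue", "OD280/OD315 of diluted wines", "Proline",
]

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # 15 fields: unnamed index, 13 features, class

samples = []
for record in reader:
    # First field is the row index, last is the class label, the rest are features.
    index, *values, label = record
    features = dict(zip(config_columns, map(float, values)))
    samples.append((int(index), features, int(label)))

for index, features, label in samples:
    print(index, label, features["Malic acid"], features["Proline"])
```

Keeping the `class` column out of the feature dict mirrors the model card's statement that labels are used only for evaluation.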