Update README.md
README.md
CHANGED
@@ -23,7 +23,7 @@ base_model:

 ## Model Description

-**Municipal Topics Classifier** is an ensemble machine learning system specialized in **multi-label topic classification** for Portuguese municipal council meeting minutes. The model combines Gradient Boosting with Active Learning and BERTimbau embeddings to identify multiple simultaneous topics within municipal discussion subjects, making it particularly effective for categorizing complex governmental content.
+**Municipal Topics Classifier** is an ensemble machine learning system specialized in **multi-label topic classification** for Portuguese municipal council meeting minutes subjects. The model combines Gradient Boosting with Active Learning and BERTimbau embeddings to identify multiple simultaneous topics within municipal discussion subjects, making it particularly effective for categorizing complex governmental content.

 🚀 **Try out the model:** [Hugging Face Space Demo](https://huggingface.co/spaces/anonymous12321/GB_CouncilTopics-PT)

@@ -75,22 +75,6 @@ The model processes Portuguese municipal texts through a sophisticated pipeline
 - Per-label optimal thresholds (not fixed 0.5)
 - Optimized for F1-score on validation set

-### Example
-
-**Input:**
-```
-A Câmara Municipal aprovou o orçamento de 2024 com investimentos em infraestruturas
-e transportes públicos. O vereador apresentou uma proposta para melhorar o sistema
-de recolha de resíduos.
-```
-
-**Output:**
-```
-Orçamento e Finanças (Confidence: 89%)
-Obras Públicas (Confidence: 76%)
-Transportes (Confidence: 68%)
-Ambiente (Confidence: 54%)
-```

 ## Usage

@@ -136,18 +120,6 @@ predicted_labels = mlb.inverse_transform(predictions)
 print(f"Predicted Topics: {predicted_labels}")
 ```

-## Evaluation Results
-
-### Test Set Performance
-
-| Metric | Score |
-|--------|-------|
-| **Micro F1-Score** | 0.82 |
-| **Macro F1-Score** | 0.74 |
-| **Hamming Loss** | 0.08 |
-| **Subset Accuracy** | 0.45 |
-| **Average Precision** | 0.79 |
-

 ## Dataset

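The README context lines in this diff mention per-label thresholds tuned for F1 on a validation set instead of a fixed 0.5 cutoff. The sketch below illustrates that general idea in plain Python. It is not the repository's implementation: the function names, the 0.05-step threshold grid, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for a single label; 0.0 when there are no true or predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(
    val_true: Sequence[Sequence[int]],
    val_proba: Sequence[Sequence[float]],
    grid: Optional[Sequence[float]] = None,
) -> List[float]:
    """For each label, pick the grid threshold with the best validation F1."""
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # 0.05 .. 0.95 (assumed grid)
    thresholds = []
    for j in range(len(val_true[0])):
        truth = [row[j] for row in val_true]
        scores = [row[j] for row in val_proba]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            preds = [1 if s >= t else 0 for s in scores]
            f1 = f1_binary(truth, preds)
            if f1 > best_f1:  # strict ">" keeps the lowest threshold on ties
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


def apply_thresholds(
    proba: Sequence[Sequence[float]], thresholds: Sequence[float]
) -> List[List[int]]:
    """Binarize each label's score with that label's own threshold."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)] for row in proba]


# Toy validation data (made up): 4 documents, 2 labels.
val_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
val_proba = [[0.40, 0.10], [0.35, 0.80], [0.20, 0.30], [0.10, 0.75]]

thresholds = tune_thresholds(val_true, val_proba)
predictions = apply_thresholds(val_proba, thresholds)
fixed_half = apply_thresholds(val_proba, [0.5, 0.5])
```

In this toy data the first label never scores above 0.40, so a fixed 0.5 cutoff would never predict it, while a tuned per-label threshold recovers both positives. In practice the thresholds would be fitted on held-out validation data and then reused unchanged at prediction time.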