---
title: Council Topics Classifier
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.36.0
app_file: src/streamlit_app.py
pinned: false
license: cc-by-4.0
---

# 🏛️ Council Topics Classifier

**Council Topics Classifier** is a system that automatically identifies the topics of **discussion subjects in Portuguese municipal meeting minutes**.

---

## 🎯 About

This demo showcases the classifier's ability to:

- Detect topics in discussion subjects taken from Portuguese municipal texts
- Use a hybrid feature set (TF-IDF + BERTimbau embeddings)
- Combine Logistic Regression and Gradient Boosting models in an adaptive weighted ensemble
- Apply dynamic thresholds optimized per topic
- Handle imbalanced topic distributions with active learning
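
To make the per-topic threshold idea concrete, here is a minimal sketch of how such thresholds could be chosen. It assumes the thresholds are picked by a grid search that maximizes F1 on validation data. The function names (`tune_thresholds`, `predict_topics`), the grid, the topic names, and the toy probabilities are all hypothetical and are not taken from this project's code.

```python
from typing import Dict, List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def tune_thresholds(
    val_probs: Dict[str, List[float]],
    val_labels: Dict[str, List[int]],
    grid: Sequence[float] = tuple(i / 20 for i in range(1, 20)),
    default: float = 0.5,
) -> Dict[str, float]:
    """For each topic, pick the grid threshold with the best validation F1."""
    thresholds: Dict[str, float] = {}
    for topic, probs in val_probs.items():
        labels = val_labels[topic]
        best_t, best_f1 = default, 0.0
        for t in grid:
            preds = [1 if p >= t else 0 for p in probs]
            score = f1_score(labels, preds)
            if score > best_f1:  # strict '>' keeps the lowest threshold on ties
                best_t, best_f1 = t, score
        thresholds[topic] = best_t
    return thresholds


def predict_topics(probs: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the topics whose probability reaches their own threshold."""
    return [t for t, p in probs.items() if p >= thresholds.get(t, 0.5)]


# Toy validation data (hypothetical): "Cultura" is scored low by the model,
# so a fixed 0.5 cut-off would never predict it.
val_probs = {
    "Urbanismo": [0.9, 0.8, 0.3, 0.2],
    "Cultura": [0.35, 0.3, 0.1, 0.05],
}
val_labels = {
    "Urbanismo": [1, 1, 0, 0],
    "Cultura": [1, 1, 0, 0],
}
thresholds = tune_thresholds(val_probs, val_labels)
predicted = predict_topics({"Urbanismo": 0.3, "Cultura": 0.3}, thresholds)
```

With this toy data the low-scoring topic ends up with a threshold well below 0.5, which is the behavior a per-topic threshold is meant to provide for topics the model rarely scores highly.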

---

## 📊 Model Performance

- **Model Architecture**: Logistic Regression + three Gradient Boosting models
- **Features**: TF-IDF (1–3 n-grams) + BERTimbau contextual embeddings
- **Adaptive weighting**: Rare topics get a higher LogReg weight, common topics get a higher GB weight
- **Dynamic thresholds**: Optimized per topic using validation data
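
The adaptive weighting described above can be illustrated with a short sketch. It assumes the Logistic Regression weight falls linearly as a topic becomes more frequent and that the three Gradient Boosting outputs are simply averaged. The interpolation rule, the cut-off, and the weight values are hypothetical choices made for illustration, not the values used by this project.

```python
from typing import Sequence


def logreg_weight(
    topic_freq: float,
    rare_cutoff: float = 0.05,  # hypothetical: topics at or above this share count as common
    w_rare: float = 0.7,        # hypothetical LogReg weight for the rarest topics
    w_common: float = 0.3,      # hypothetical LogReg weight for common topics
) -> float:
    """Interpolate the LogReg weight from w_rare (frequency 0) to w_common."""
    rarity = max(0.0, 1.0 - topic_freq / rare_cutoff)
    return w_common + (w_rare - w_common) * rarity


def ensemble_probability(
    p_logreg: float,
    p_gbs: Sequence[float],
    topic_freq: float,
) -> float:
    """Blend the LogReg probability with the mean Gradient Boosting probability."""
    w = logreg_weight(topic_freq)
    gb_mean = sum(p_gbs) / len(p_gbs)
    return w * p_logreg + (1.0 - w) * gb_mean


# Same model outputs, two different topic frequencies (toy numbers).
p_rare = ensemble_probability(0.8, [0.2, 0.4, 0.3], topic_freq=0.0)
p_common = ensemble_probability(0.8, [0.2, 0.4, 0.3], topic_freq=0.2)
```

Here the same pair of model outputs produces a higher blended score for the rare topic, because the Logistic Regression prediction counts for more when the topic has few training examples.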

---

## 🚀 Usage

1. **Try Your Own Text**: Paste Portuguese municipal text in the input area
2. **Demo Examples**: Select from pre-loaded examples to see topic predictions
3. **View Results**: Confidence scores for each predicted topic are displayed interactively

---

## 🔧 Running Locally

```bash
# Install the dependencies
pip install -r requirements.txt

# Launch the app (entry point as declared in app_file above)
streamlit run src/streamlit_app.py
```