anonymous12321/CouncilTopics-PT

@@ -6,7 +6,6 @@ colorTo: blue
 sdk: docker
 app_port: 8501
 tags:
-- streamlit
 - text-classification
 - multilabel-classification
 - portuguese
@@ -24,7 +23,7 @@ base_model:
 ## Model Description
-**Intelligent Stacking** is an advanced ensemble learning system specialized in multilabel classification of Portuguese administrative documents. The model combines 12 base models with intelligent meta-learning to achieve state-of-the-art performance on municipal and governmental document categorization tasks.
 **Try out the model**: [Hugging Face Space Demo](https://huggingface.co/spaces/YOUR_USERNAME/intelligent-stacking-demo)
@@ -33,7 +32,7 @@ base_model:
 - 🧠 **Intelligent Meta-Learning**: Advanced ensemble combination using stacked generalization
 - 📚 **12 Base Models**: 3 feature sets × 4 algorithms for robust predictions
 - 🇵🇹 **Portuguese Optimized**: Fine-tuned for Portuguese administrative language
-- ⚡ **High Performance**: F1-macro score of 0.5486 with 54.7% improvement over baseline
 - 🏢 **22 Categories**: Comprehensive municipal administrative document classification
 - 🎯 **Dynamic Thresholds**: Optimized per-category decision boundaries
@@ -65,71 +64,6 @@ The Intelligent Stacking system operates in multiple stages:
 4. **Dynamic Thresholds**: Per-category optimized decision boundaries for multilabel output
-## Usage
-### Quick Start with Python
-```python
-import joblib
-import numpy as np
-from sklearn.feature_extraction.text import TfidfVectorizer
-from scipy.sparse import hstack, csr_matrix
-# Load the model components
-tfidf_vectorizer = joblib.load("int_stacking_tfidf_vectorizer.joblib")
-meta_learner = joblib.load("int_stacking_meta_learner.joblib")
-mlb_encoder = joblib.load("int_stacking_mlb_encoder.joblib")
-base_models = joblib.load("int_stacking_base_models.joblib")
-optimal_thresholds = np.load("int_stacking_optimal_thresholds.npy")
-# Prepare text
-text = """CONTRATO DE PRESTAÇÃO DE SERVIÇOS
-Entre a Administração Pública Municipal e a empresa contratada,
-fica estabelecido o presente contrato para prestação de serviços
-de manutenção e conservação de vias públicas."""
-# Extract features
-tfidf_features = tfidf_vectorizer.transform([text])
-# Generate base model predictions
-base_predictions = np.zeros((1, len(mlb_encoder.classes_), 12))
-model_idx = 0
-for feat_name in ["TF-IDF", "BERT", "TF-IDF+BERT"]:
-    for algo_name in ["LogReg_C1", "LogReg_C05", "GradBoost", "RandomForest"]:
-        model_key = f"{feat_name}_{algo_name}"
-        if model_key in base_models:
-            model = base_models[model_key]
-            pred = model.predict_proba(tfidf_features)
-            base_predictions[0, :, model_idx] = pred[0]
-        model_idx += 1
-# Meta-learner prediction
-meta_features = base_predictions.reshape(1, -1)
-meta_pred = meta_learner.predict_proba(meta_features)[0]
-# Apply dynamic thresholds
-predicted_labels = []
-for i, (prob, threshold) in enumerate(zip(meta_pred, optimal_thresholds)):
-    if prob > threshold:
-        predicted_labels.append({
-            "label": mlb_encoder.classes_[i],
-            "probability": float(prob),
-            "confidence": "high" if prob > 0.7 else "medium" if prob > 0.4 else "low"
-        })
-# Sort by probability
-predicted_labels.sort(key=lambda x: x["probability"], reverse=True)
-print("Predicted categories:", predicted_labels)
-```
-### Streamlit Demo
-The model includes a complete Streamlit web interface for easy testing:
-```bash
-streamlit run app.py
-```
 ## Categories
@@ -173,7 +107,6 @@ The model classifies documents into 22 Portuguese administrative categories:
 | **Hamming Loss** | **0.0426** | Label-wise error rate |
 | **Average Precision (macro)** | **0.608** | Macro-averaged AP |
 | **Average Precision (micro)** | **0.785** | Micro-averaged AP |
-| **Improvement** | **+54.7%** | Over Decision Tree baseline |
 ## Technical Architecture
@@ -212,19 +145,6 @@ The model was trained on a curated dataset of Portuguese administrative document
 - **Threshold Sensitivity**: Performance depends on carefully tuned per-category thresholds
 - **Class Imbalance**: Some categories may have lower precision due to limited training examples
-## Citation
-If you use this model in your research, please cite:
-```bibtex
-@article{intelligent_stacking_2024,
-  title={Intelligent Stacking for Multilabel Portuguese Administrative Document Classification},
-  author={[Your Name]},
-  journal={[Journal Name]},
-  year={2024},
-  note={Model available at https://huggingface.co/YOUR_USERNAME/intelligent-stacking}
-}
-```
 ## License

 sdk: docker
 app_port: 8501
 tags:
 - text-classification
 - multilabel-classification
 - portuguese
 ## Model Description
+**Intelligent Stacking** is an advanced ensemble learning system specialized in multilabel classification of Portuguese administrative documents. The model combines 12 base models with intelligent meta-learning to achieve high performance on municipal and governmental document categorization tasks.
 **Try out the model**: [Hugging Face Space Demo](https://huggingface.co/spaces/YOUR_USERNAME/intelligent-stacking-demo)
 - 🧠 **Intelligent Meta-Learning**: Advanced ensemble combination using stacked generalization
 - 📚 **12 Base Models**: 3 feature sets × 4 algorithms for robust predictions
 - 🇵🇹 **Portuguese Optimized**: Fine-tuned for Portuguese administrative language
+- ⚡ **High Performance**: F1-macro score of 0.5486
 - 🏢 **22 Categories**: Comprehensive municipal administrative document classification
 - 🎯 **Dynamic Thresholds**: Optimized per-category decision boundaries
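The "3 feature sets × 4 algorithms" layout can be sketched as follows. The feature-set and algorithm names, the `{feature}_{algorithm}` key format, and the flattening into one meta-feature row are taken from the Quick Start snippet removed in this diff; the zero-filled array is only a placeholder for real base-model probabilities, so treat this as an illustration under those assumptions rather than the packaged implementation.

```python
from itertools import product

import numpy as np

# Names as they appear in the removed Quick Start snippet.
FEATURE_SETS = ["TF-IDF", "BERT", "TF-IDF+BERT"]
ALGORITHMS = ["LogReg_C1", "LogReg_C05", "GradBoost", "RandomForest"]

# One key per (feature set, algorithm) pair: 3 x 4 = 12 base models.
model_keys = [f"{feat}_{algo}" for feat, algo in product(FEATURE_SETS, ALGORITHMS)]

n_labels = 22  # number of categories stated in the model card

# Placeholder for per-label probabilities from each base model (one document).
base_predictions = np.zeros((1, n_labels, len(model_keys)))

# The meta-learner input is the flattened stack of base-model outputs.
meta_features = base_predictions.reshape(1, -1)

print(len(model_keys), meta_features.shape)
```

With 22 categories and 12 base models, each document yields a meta-feature row of 264 values.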
 4. **Dynamic Thresholds**: Per-category optimized decision boundaries for multilabel output
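A minimal sketch of how per-category thresholds could turn probabilities into a multilabel output. The function name `apply_thresholds`, the example labels, and all numeric values are hypothetical; only the strict `probability > threshold` comparison mirrors the Quick Start snippet removed in this diff.

```python
import numpy as np


def apply_thresholds(probabilities, thresholds, labels):
    """Return (label, probability) pairs that exceed their own threshold,
    sorted by descending probability."""
    probabilities = np.asarray(probabilities, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    if probabilities.shape != thresholds.shape or len(labels) != probabilities.size:
        raise ValueError("probabilities, thresholds and labels must align")
    # Each category is compared against its own decision boundary.
    selected = np.flatnonzero(probabilities > thresholds)
    pairs = [(labels[i], float(probabilities[i])) for i in selected]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical categories and values, for illustration only.
example_labels = ["Contratos", "Obras", "Finanças"]
example_probs = [0.62, 0.35, 0.48]
example_thresholds = [0.50, 0.30, 0.55]

result = apply_thresholds(example_probs, example_thresholds, example_labels)
print(result)
```

Note that a category with a lower probability can still be selected when its threshold is lower, which is the point of tuning one boundary per category instead of using a global 0.5 cutoff.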
 ## Categories
 | **Hamming Loss** | **0.0426** | Label-wise error rate |
 | **Average Precision (macro)** | **0.608** | Macro-averaged AP |
 | **Average Precision (micro)** | **0.785** | Micro-averaged AP |
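For readers unfamiliar with the Hamming Loss row, the following sketch computes it as the fraction of individual label decisions that are wrong. The helper and the toy indicator matrices are illustrative assumptions, not the evaluation code behind the reported figures.

```python
import numpy as np


def hamming_loss(y_true, y_pred):
    """Fraction of (document, label) cells where prediction and truth differ."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.mean(y_true != y_pred))


# Toy data: 2 documents x 4 labels, binary indicator format.
y_true = [[1, 0, 1, 0],
          [0, 1, 0, 0]]
y_pred = [[1, 0, 0, 0],
          [0, 1, 1, 0]]

loss = hamming_loss(y_true, y_pred)  # 2 wrong cells out of 8
print(loss)
```

Under this definition, the reported 0.0426 would correspond to roughly one incorrect label decision per document across the 22 categories.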
 ## Technical Architecture
 - **Threshold Sensitivity**: Performance depends on carefully tuned per-category thresholds
 - **Class Imbalance**: Some categories may have lower precision due to limited training examples
 ## License