Commit d7d15c8
Parent(s): 6c1f3ec
Changed to DeBERTa model
app.py CHANGED
@@ -108,7 +108,7 @@ ENTITY_COLORS = {
 @st.cache_resource
 def load_model():
     """Load the VotIE model and tokenizer."""
-    model_name = "Anonymous3445/
+    model_name = "Anonymous3445/XLM-RoBERTa-CRF-VotIE"
     tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
     model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
     return tokenizer, model
@@ -341,7 +341,7 @@ def format_event(event: Dict[str, Any]) -> None:
 def main():
     # Header - centered with large title
     st.markdown("<h1 style='text-align: center; font-size: 2.8rem; font-weight: 700; margin-bottom: 0.5rem;'>🗳️ VotIE: Voting Information Extraction</h1>", unsafe_allow_html=True)
-    st.markdown("<p style='text-align: center; font-size: 1.2rem; color: #666; margin-bottom: 2rem;'>VotIE extracts structured voting information from Portuguese text using <strong>
+    st.markdown("<p style='text-align: center; font-size: 1.2rem; color: #666; margin-bottom: 2rem;'>VotIE extracts structured voting information from Portuguese text using <strong>XLM-RoBERTa</strong> + CRF layer.</p>", unsafe_allow_html=True)
 
     # Sidebar
     with st.sidebar:
@@ -401,7 +401,7 @@ def main():
         )
 
         st.markdown("---")
-        st.markdown("**Model**: [
+        st.markdown("**Model**: [XLM-RoBERTa-CRF-VotIE](https://huggingface.co/Anonymous3445/XLM-RoBERTa-CRF-VotIE)")
 
     # Main content area - unified layout
     st.markdown("---")
@@ -410,8 +410,8 @@ def main():
     with st.expander("🔧 How It Works", expanded=False):
         st.markdown("""
         **Process**:
-        1. **Tokenization**: Text is split into tokens using
-        2. **Entity Recognition**: Each token is classified using
+        1. **Tokenization**: Text is split into tokens using XLM-RoBERTa tokenizer
+        2. **Entity Recognition**: Each token is classified using XLM-RoBERTa + CRF model
         3. **Token Classification**: Tokens are labeled with BIO tags (B-SUBJECT, I-VOTING, etc.)
         4. **Event Construction**: Labeled entities are grouped into structured voting events
         5. **Outcome Determination**: System infers voting results from extracted data
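Steps 3 and 4 of the "How It Works" text in the last hunk (BIO tagging, then grouping labeled entities into events) can be illustrated with a minimal sketch. It is not the code in app.py, which this diff does not show. The helper names `group_bio_entities` and `build_event`, the dictionary shape of the event, and the tags `I-SUBJECT` and `B-VOTING` are assumptions made for illustration; only `B-SUBJECT` and `I-VOTING` appear in the source.

```python
from typing import Dict, List, Optional, Tuple


def group_bio_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (label, text) spans (hypothetical helper)."""
    entities: List[Tuple[str, str]] = []
    label: Optional[str] = None
    span: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span; close any open one first.
            if label is not None:
                entities.append((label, " ".join(span)))
            label, span = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # An I- tag continues the open span only if the labels match.
            span.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span and drop this token.
            if label is not None:
                entities.append((label, " ".join(span)))
            label, span = None, []

    if label is not None:
        entities.append((label, " ".join(span)))
    return entities


def build_event(entities: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect entity texts under lower-cased labels (assumed event shape)."""
    event: Dict[str, List[str]] = {}
    for label, text in entities:
        event.setdefault(label.lower(), []).append(text)
    return event


# Illustrative input; a real run would take tags from the model's predictions.
tokens = ["A", "proposta", "foi", "aprovada", "por", "unanimidade"]
tags = ["B-SUBJECT", "I-SUBJECT", "O", "B-VOTING", "O", "O"]

entities = group_bio_entities(tokens, tags)
event = build_event(entities)
print(entities)  # [('SUBJECT', 'A proposta'), ('VOTING', 'aprovada')]
print(event)     # {'subject': ['A proposta'], 'voting': ['aprovada']}
```

Dropping an `I-` tag that does not continue an open span is one possible policy. A CRF layer typically makes such invalid transitions unlikely, so the choice may rarely matter in practice.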