Shreyas Meher committed on
Commit
522f5f8
·
unverified ·
1 Parent(s): 0adf14d

New version with FT

README.md CHANGED
@@ -1,216 +1,233 @@
1
- ![ConfliBERT GUI](./gui.png)
2
-
3
- # ConfliBERT GUI Application
4
-
5
- A web-based interface for [ConfliBERT](https://github.com/eventdata/ConfliBERT), a BERT-based model specialized in conflict and political event analysis. This application provides multiple Natural Language Processing capabilities including Named Entity Recognition (NER), Text Classification, Multi-label Classification, and Question Answering.
6
-
7
- ## Features
8
-
9
- - **Named Entity Recognition (NER)**
10
- - Identifies and classifies named entities in text
11
- - Entities include: Organizations, Persons, Locations, Quantities, Weapons, Nationalities, Temporal references, and more
12
- - Color-coded visualization of entities in the web interface
13
-
14
- - **Text Classification**
15
- - Binary classification for conflict-related content
16
- - Determines if text is related to conflict, violence, or politics
17
- - Provides confidence scores for classifications
18
-
19
- - **Multi-label Classification**
20
- - Categorizes text into multiple event types
21
- - Categories include: Armed Assault, Bombing or Explosion, Kidnapping, and Other
22
- - Provides confidence scores for each category
23
-
24
- - **Question Answering**
25
- - Extracts answers from provided context based on questions
26
- - Specialized for conflict-related queries
27
-
28
- ## Installation
29
-
30
- ### Requirements
31
-
32
- **Required:**
33
- - Python 3.8+
34
- - Git
35
- - Code editor (VS Code recommended)
36
-
37
- **Optional but recommended:**
38
- - PowerShell 5.0+ (Windows)
39
- - Terminal (Mac)
40
-
41
- ### Installation Steps
42
-
43
- 1. Install Python:
44
- - Download from [python.org](https://www.python.org/downloads/)
45
- - Check installation: `python --version`
46
-
47
- 2. Install Git:
48
- - Windows: Download from [git-scm.com](https://git-scm.com/downloads)
49
- - Mac: `brew install git` or download from [git-scm.com](https://git-scm.com/downloads)
50
- - Check installation: `git --version`
51
-
52
- 3. Clone repository:
53
- ```bash
54
- git clone https://github.com/shreyasmeher/conflibert-gui.git
55
- cd conflibert-gui
56
- ```
57
-
58
- 4. Create and activate virtual environment:
59
- ```bash
60
- # Create environment
61
- python -m venv env
62
-
63
- # Activate environment
64
- # On Windows:
65
- env\Scripts\activate
66
- # On Mac/Linux:
67
- source env/bin/activate
68
- ```
69
-
70
- 5. For Windows users with permission errors, [run PowerShell as Administrator](https://www.javatpoint.com/powershell-run-as-administrator):
71
- ```powershell
72
- Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope LocalMachine
73
- ```
74
-
75
- 6. Install requirements:
76
- ```bash
77
- pip install -r requirements.txt
78
- ```
79
-
80
- ### Package Requirements
81
-
82
- - Python 3.8+
83
- - PyTorch
84
- - TensorFlow
85
- - Transformers
86
- - Gradio
87
- - Pandas
88
-
89
- ## Usage
90
-
91
- ### Running the Application
92
-
93
- 1. Start the application:
94
- ```bash
95
- python app.py
96
- ```
97
-
98
- 2. Open your web browser and navigate to:
99
- ```
100
- http://localhost:7860
101
- ```
102
-
103
- ### Using Different Features
104
-
105
- #### Individual Text Analysis
106
-
107
- 1. Select the desired task from the dropdown menu:
108
- - Named Entity Recognition
109
- - Text Classification
110
- - Multilabel Classification
111
- - Question Answering
112
-
113
- 2. For standard tasks:
114
- - Enter your text in the input box
115
- - Click Submit
116
-
117
- 3. For Question Answering:
118
- - Enter the context in the context box
119
- - Enter your question in the question box
120
- - Click Submit
121
-
122
- #### Batch Processing with CSV
123
-
124
- 1. Prepare a CSV file with a 'text' column containing your texts
125
-
126
- 2. Select the desired task:
127
- - NER
128
- - Text Classification
129
- - Multilabel Classification
130
-
131
- 3. Upload your CSV file using the file upload component
132
-
133
- 4. Click Submit to process the entire file
134
-
135
- 5. Download the results CSV containing the original text and analysis results
136
-
137
- ## Model Information
138
-
139
- ConfliBERT uses several specialized models:
140
-
141
- - **NER Model**: `eventdata-utd/conflibert-named-entity-recognition`
142
- - **Binary Classification**: `eventdata-utd/conflibert-binary-classification`
143
- - **Multi-label Classification**: `eventdata-utd/conflibert-satp-relevant-multilabel`
144
- - **Question Answering**: `salsarra/ConfliBERT-QA`
145
-
146
- ## Output Formats
147
-
148
- ### NER Output
149
- ```
150
- EntityType: Entity1, Entity2 || EntityType2: Entity3 | Entity4
151
- ```
152
-
153
- ### Binary Classification Output
154
- ```
155
- Class (Confidence%)
156
- ```
157
-
158
- ### Multi-label Classification Output
159
- ```
160
- Class1 (Confidence%) | Class2 (Confidence%)
161
- ```
162
-
163
- ## Technical Details
164
-
165
- ### File Structure
166
- ```
167
- conflibert-gui/
168
- ├── app.py # Main application file
169
- ├── requirements.txt # Package dependencies
170
- └── README.md # Documentation
171
- ```
172
-
173
- ### Key Components
174
-
175
- - **UI Components**: Built using Gradio
176
- - **Backend Processing**: PyTorch and TensorFlow
177
- - **Data Processing**: Pandas for CSV handling
178
- - **Model Integration**: Hugging Face Transformers
179
-
180
- ## Contributing
181
-
182
- 1. Fork the repository
183
- 2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
184
- 3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
185
- 4. Push to the branch (`git push origin feature/AmazingFeature`)
186
- 5. Open a Pull Request
187
-
188
- ## Credits
189
-
190
- Developed by:
191
- - [Sultan Alsarra](https://www.linkedin.com/in/sultan-alsarra-phd-56977a63/)
192
- - [Shreyas Meher](http://shreyasmeher.com)
193
-
194
- ## License
195
-
196
- This project is licensed under the MIT License - see the LICENSE file for details.
197
-
198
- ## Institutional Support
199
-
200
- - [UTD Event Data](https://eventdata.utdallas.edu/)
201
- - [University of Texas at Dallas](https://www.utdallas.edu/)
202
-
203
- ## Citation
204
-
205
- If you use this tool in your research, please cite:
206
-
207
- ```bibtex
208
- @inproceedings{hu2022conflibert,
209
- title={ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence},
210
- author={Hu, Yibo and Hosseini, MohammadSaleh and Parolin, Erick Skorupa and Osorio, Javier and Khan, Latifur and Brandt, Patrick and D’Orazio, Vito},
211
- booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
212
- pages={5469--5482},
213
- year={2022}
214
- }
215
- ```
216
-
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # ConfliBERT
2
+
3
+ [ConfliBERT](https://github.com/eventdata/ConfliBERT) is a pretrained language model built specifically for analyzing conflict and political violence text. This application provides a browser-based interface for running inference with ConfliBERT's pretrained models, fine-tuning custom classifiers on your own data, and comparing model performance across architectures.
4
+
5
+ Developed by [Shreyas Meher](http://shreyasmeher.com).
6
+
7
+ ## Screenshots
8
+
9
+ ### Home
10
+
11
+ The landing page shows your system configuration (GPU/CPU, RAM, platform) and an overview of everything the app can do.
12
+
13
+ <!-- Take a screenshot of the Home tab and save as screenshots/home.png -->
14
+ ![Home](./screenshots/home.png)
15
+
16
+ ### Named Entity Recognition
17
+
18
+ Identifies persons, organizations, locations, weapons, and other entity types. Results are color-coded. Supports single text and CSV batch processing.
19
+
20
+ <!-- Take a screenshot of the NER tab with sample output and save as screenshots/ner.png -->
21
+ ![NER](./screenshots/ner.png)
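The color-coded output comes from token-level tags grouped into entity spans. As an illustrative sketch (the BIO tags below are hand-written, not real model output), the grouping works roughly like this:

```python
# Group BIO-style NER tags into (entity, type) spans.
# Tokens and tags here are invented for illustration.
tokens = ["John", "Smith", "visited", "Kabul"]
tags = ["B-Person", "I-Person", "O", "B-Location"]

entities, current, cur_type = [], [], None
for tok, tag in zip(tokens, tags):
    if tag == "O":
        if current:
            entities.append((" ".join(current), cur_type))
        current, cur_type = [], None
    else:
        prefix, etype = tag.split("-")
        # A B- tag or a change of entity type starts a new entity
        if prefix == "B" or etype != cur_type:
            if current:
                entities.append((" ".join(current), cur_type))
            current, cur_type = [tok], etype
        else:
            current.append(tok)
if current:
    entities.append((" ".join(current), cur_type))

print(entities)  # [('John Smith', 'Person'), ('Kabul', 'Location')]
```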
22
+
23
+ ### Binary Classification
24
+
25
+ Classifies text as conflict-related or not. Uses the pretrained ConfliBERT classifier by default, or load your own fine-tuned model.
26
+
27
+ <!-- Take a screenshot of the Classification tab and save as screenshots/classification.png -->
28
+ ![Classification](./screenshots/classification.png)
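For a two-class model like this, the confidence score is typically a softmax over the output logits. A minimal sketch with invented logits (not real model outputs):

```python
import math

# Invented logits for the two classes: [conflict-related, not conflict-related]
logits = [1.2, -0.8]

# Softmax turns logits into probabilities that sum to 1
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]

label = "Conflict-related" if probs[0] >= probs[1] else "Not conflict-related"
confidence = max(probs) * 100
print(f"{label} ({confidence:.1f}%)")  # Conflict-related (88.1%)
```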
29
+
30
+ ### Multilabel Classification
31
+
32
+ Scores text against four event categories (Armed Assault, Bombing/Explosion, Kidnapping, Other). Each category is scored independently.
33
+
34
+ <!-- Take a screenshot of the Multilabel tab and save as screenshots/multilabel.png -->
35
+ ![Multilabel](./screenshots/multilabel.png)
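"Scored independently" means each category gets its own sigmoid probability rather than competing in a softmax, so any number of categories can cross the 0.5 threshold at once. A sketch with illustrative logits (not real model outputs):

```python
import math

labels = ["Armed Assault", "Bombing or Explosion", "Kidnapping", "Other"]
logits = [2.1, -0.4, -3.0, 0.7]  # illustrative values

# Multilabel: an independent sigmoid per label, no softmax across labels
scores = [1 / (1 + math.exp(-z)) for z in logits]
flagged = [lab for lab, s in zip(labels, scores) if s >= 0.5]
print(flagged)  # ['Armed Assault', 'Other']
```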
36
+
37
+ ### Question Answering
38
+
39
+ Provide a context passage and a question. The model extracts the most relevant answer span.
40
+
41
+ <!-- Take a screenshot of the QA tab and save as screenshots/qa.png -->
42
+ ![QA](./screenshots/qa.png)
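Extractive QA models score every context token as a possible answer start and end; the reported answer is the span between the top-scoring positions. A toy sketch with invented logits:

```python
# Pick the answer span from start/end logits (all values invented)
tokens = ["The", "attack", "occurred", "in", "Mosul", "."]
start_logits = [0.1, 0.3, 0.2, 0.5, 2.4, 0.0]
end_logits   = [0.0, 0.1, 0.2, 0.3, 2.6, 0.4]

start = max(range(len(tokens)), key=lambda i: start_logits[i])
end = max(range(len(tokens)), key=lambda i: end_logits[i])
answer = " ".join(tokens[start:end + 1])
print(answer)  # Mosul
```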
43
+
44
+ ### Fine-tuning
45
+
46
+ Train your own binary or multiclass classifier directly in the browser. Upload data (or load a built-in example), pick a base model, configure training, and go. After training, results and a "Try Your Model" panel appear side by side. You can also save the model and run batch predictions.
47
+
48
+ <!-- Take a screenshot of the Fine-tune tab and save as screenshots/finetune.png -->
49
+ ![Fine-tune](./screenshots/finetune.png)
50
+
51
+ ### Model Comparison
52
+
53
+ Compare multiple base model architectures on the same dataset. The comparison produces a metrics table, a grouped bar chart, and ROC-AUC curves.
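ROC-AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties count half). A self-contained sketch with toy labels and scores:

```python
# ROC-AUC as the rank-win rate of positives over negatives
def roc_auc(labels, scores):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc = roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
print(auc)  # 1.0 — every positive outscores every negative
```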
54
+
55
+ ## Supported Models
56
+
57
+ ### Pretrained (Inference)
58
+
59
+ | Task | HuggingFace Model |
60
+ |------|-------------------|
61
+ | NER | `eventdata-utd/conflibert-named-entity-recognition` |
62
+ | Binary Classification | `eventdata-utd/conflibert-binary-classification` |
63
+ | Multilabel Classification | `eventdata-utd/conflibert-satp-relevant-multilabel` |
64
+ | Question Answering | `salsarra/ConfliBERT-QA` |
65
+
66
+ ### Fine-tuning (Base Models)
67
+
68
+ | Model | HuggingFace ID | Notes |
69
+ |-------|----------------|-------|
70
+ | ConfliBERT | `snowood1/ConfliBERT-scr-uncased` | Best for conflict/political text |
71
+ | BERT Base Uncased | `bert-base-uncased` | General-purpose baseline |
72
+ | BERT Base Cased | `bert-base-cased` | Case-sensitive variant |
73
+ | RoBERTa Base | `roberta-base` | Improved BERT training |
74
+ | ModernBERT Base | `answerdotai/ModernBERT-base` | Up to 8K token context |
75
+ | DeBERTa v3 Base | `microsoft/deberta-v3-base` | Strong on benchmarks |
76
+ | DistilBERT Base | `distilbert-base-uncased` | Faster, smaller |
77
+
78
+ ## Installation
79
+
80
+ ### Requirements
81
+
82
+ - Python 3.8+
83
+ - Git
84
+
85
+ ### Steps
86
+
87
+ 1. Clone the repository:
88
+
89
+ ```bash
90
+ git clone https://github.com/shreyasmeher/conflibert-gui.git
91
+ cd conflibert-gui
92
+ ```
93
+
94
+ 2. Create and activate a virtual environment:
95
+
96
+ ```bash
97
+ python -m venv env
98
+
99
+ # Mac/Linux:
100
+ source env/bin/activate
101
+
102
+ # Windows:
103
+ env\Scripts\activate
104
+ ```
105
+
106
+ On Windows, if you get a permission error, run PowerShell as Administrator and execute:
107
+
108
+ ```powershell
109
+ Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope LocalMachine
110
+ ```
111
+
112
+ 3. Install dependencies:
113
+
114
+ ```bash
115
+ pip install -r requirements.txt
116
+ ```
117
+
118
+ ## Usage
119
+
120
+ Start the application:
121
+
122
+ ```bash
123
+ python app.py
124
+ ```
125
+
126
+ The app opens at `http://localhost:7860` and generates a public shareable link. The first launch takes a minute or two while the pretrained models download.
127
+
128
+ ### Tabs
129
+
130
+ | Tab | What it does |
131
+ |-----|-------------|
132
+ | Home | System info, feature overview, citation |
133
+ | Named Entity Recognition | Identify entities in text or CSV |
134
+ | Binary Classification | Conflict vs. non-conflict, supports custom models |
135
+ | Multilabel Classification | Multi-event-type scoring |
136
+ | Question Answering | Extract answers from a context passage |
137
+ | Fine-tune | Train classifiers, compare models, ROC curves |
138
+
139
+ ### Fine-tuning Quick Start
140
+
141
+ 1. Go to the **Fine-tune** tab
142
+ 2. Click **"Load Example: Binary"** to load sample data
143
+ 3. Leave defaults and click **"Start Training"**
144
+ 4. Review metrics and try your model on new text
145
+ 5. Save the model and load it in the **Binary Classification** tab
146
+
147
+ ### Model Comparison Quick Start
148
+
149
+ 1. Upload data (or load an example) in the **Fine-tune** tab
150
+ 2. Scroll down and open **"Compare Multiple Models"**
151
+ 3. Check 2 or more models to compare
152
+ 4. Click **"Compare Models"**
153
+ 5. View the metrics table, bar chart, and ROC-AUC curves
154
+
155
+ ### Data Format
156
+
157
+ Tab-separated values (TSV), no header row. Each line: `text<TAB>label`
158
+
159
+ Binary example:
160
+ ```
161
+ The bomb exploded near the market 1
162
+ It was a sunny day at the park 0
163
+ ```
164
+
165
+ Multiclass example (integer labels starting from 0):
166
+ ```
167
+ The president signed the peace treaty 0
168
+ Militants attacked the military base 1
169
+ Thousands marched in the capital 2
170
+ Aid workers delivered food supplies 3
171
+ ```
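As a sketch, a file in this format can be parsed with the standard library (the rows mirror the binary example above); pandas users could equivalently use `read_csv(..., sep="\t", header=None)`:

```python
import csv
import io

# In-memory stand-in for train.tsv: text<TAB>label, no header row
sample = "The bomb exploded near the market\t1\nIt was a sunny day at the park\t0\n"

rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
texts = [row[0] for row in rows]
labels = [int(row[1]) for row in rows]
print(labels)  # [1, 0]
```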
172
+
173
+ ### CSV Batch Processing
174
+
175
+ Prepare a CSV with a `text` column:
176
+
177
+ ```csv
178
+ text
179
+ "The soldiers advanced toward the border."
180
+ "The festival attracted thousands of visitors."
181
+ ```
182
+
183
+ Upload it in the Batch Processing section of any inference tab.
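A minimal sketch of writing and re-reading such a file with Python's standard `csv` module (using the example sentences above, built in memory for illustration):

```python
import csv
import io

# Build a CSV with the required 'text' column
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["text"])
writer.writerow(["The soldiers advanced toward the border."])
writer.writerow(["The festival attracted thousands of visitors."])

# Read it back the way a batch processor would: one text per row
reader = csv.DictReader(io.StringIO(buf.getvalue()))
texts = [row["text"] for row in reader]
print(len(texts))  # 2
```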
184
+
185
+ ## Project Structure
186
+
187
+ ```
188
+ conflibert-gui/
189
+ ├── app.py               # Main application
190
+ ├── requirements.txt     # Dependencies
191
+ ├── README.md
192
+ ├── screenshots/         # UI screenshots for documentation
193
+ └── examples/
194
+     ├── binary/          # Example binary dataset (conflict vs non-conflict)
195
+     │   ├── train.tsv
196
+     │   ├── dev.tsv
197
+     │   └── test.tsv
198
+     └── multiclass/      # Example multiclass dataset (4 event types)
199
+         ├── train.tsv    # 0=Diplomacy, 1=Armed Conflict,
200
+         ├── dev.tsv      # 2=Protest, 3=Humanitarian
201
+         └── test.tsv
202
+ ```
203
+
204
+ ## Training Features
205
+
206
+ - Early stopping with configurable patience
207
+ - Learning rate schedulers: linear, cosine, constant, constant with warmup
208
+ - Mixed precision training (FP16) on CUDA GPUs
209
+ - Gradient accumulation for larger effective batch sizes
210
+ - Weight decay regularization
211
+ - Automatic system detection (NVIDIA GPU, Apple Silicon MPS, CPU)
212
+ - Model comparison with grouped bar charts and ROC-AUC curves
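The early-stopping rule can be sketched as: stop once the best validation loss is at least `patience` evaluations old. This is an illustrative reimplementation, not the app's exact code:

```python
# Stop when the best validation loss hasn't improved for `patience` evals
def should_stop(val_losses, patience=2):
    best_idx = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_idx >= patience

history = [0.9, 0.7, 0.65, 0.66, 0.67]  # illustrative per-epoch losses
print(should_stop(history))  # True: the best loss is 2 evaluations old
```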
213
+
214
+ ## Citation
215
+
216
+ If you use ConfliBERT in your research, please cite:
217
+
218
+ Brandt, P.T., Alsarra, S., D'Orazio, V., Heintze, D., Khan, L., Meher, S., Osorio, J. and Sianan, M., 2025. Extractive versus Generative Language Models for Political Conflict Text Classification. *Political Analysis*, pp.1-29.
219
+
220
+ ```bibtex
221
+ @article{brandt2025extractive,
222
+ title={Extractive versus Generative Language Models for Political Conflict Text Classification},
223
+ author={Brandt, Patrick T and Alsarra, Sultan and D'Orazio, Vito and Heintze, Dagmar and Khan, Latifur and Meher, Shreyas and Osorio, Javier and Sianan, Marcus},
224
+ journal={Political Analysis},
225
+ pages={1--29},
226
+ year={2025},
227
+ publisher={Cambridge University Press}
228
+ }
229
+ ```
230
+
231
+ ## License
232
+
233
+ MIT License. See LICENSE for details.
app.py CHANGED
@@ -1,878 +1,1629 @@
1
- import torch
2
- import tensorflow as tf
3
- from tf_keras import models, layers
4
- from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoModelForTokenClassification, TFAutoModelForQuestionAnswering
5
- import gradio as gr
6
- import re
7
- import pandas as pd
8
- import io
9
- import os
10
- os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
11
- import keras
12
-
13
- # Check if GPU is available and use it if possible
14
- device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
15
-
16
- MAX_TOKEN_LENGTH = 512 # Adjust based on your model's limits
17
-
18
- def truncate_text(text, tokenizer, max_length=MAX_TOKEN_LENGTH):
19
- """Truncate text to max token length"""
20
- tokens = tokenizer.encode(text, truncation=False)
21
- if len(tokens) > max_length:
22
- tokens = tokens[:max_length-1] + [tokenizer.sep_token_id]
23
- return tokenizer.decode(tokens, skip_special_tokens=True)
24
- return text
25
-
26
- def safe_process(func, text, tokenizer):
27
- """Safely process text with proper error handling"""
28
- try:
29
- truncated_text = truncate_text(text, tokenizer)
30
- return func(truncated_text)
31
- except Exception as e:
32
- error_msg = str(e)
33
- if 'out of memory' in error_msg.lower():
34
- return "Error: Text too long for processing"
35
- elif 'cuda' in error_msg.lower():
36
- return "Error: GPU processing error"
37
- else:
38
- return f"Error: {error_msg}"
39
-
40
- # Load the models and tokenizers
41
- qa_model_name = 'salsarra/ConfliBERT-QA'
42
- qa_model = TFAutoModelForQuestionAnswering.from_pretrained(qa_model_name)
43
- qa_tokenizer = AutoTokenizer.from_pretrained(qa_model_name)
44
-
45
- ner_model_name = 'eventdata-utd/conflibert-named-entity-recognition'
46
- ner_model = AutoModelForTokenClassification.from_pretrained(ner_model_name).to(device)
47
- ner_tokenizer = AutoTokenizer.from_pretrained(ner_model_name)
48
-
49
- clf_model_name = 'eventdata-utd/conflibert-binary-classification'
50
- clf_model = AutoModelForSequenceClassification.from_pretrained(clf_model_name).to(device)
51
- clf_tokenizer = AutoTokenizer.from_pretrained(clf_model_name)
52
-
53
- multi_clf_model_name = 'eventdata-utd/conflibert-satp-relevant-multilabel'
54
- multi_clf_model = AutoModelForSequenceClassification.from_pretrained(multi_clf_model_name).to(device)
55
- multi_clf_tokenizer = AutoTokenizer.from_pretrained(multi_clf_model_name)
56
-
57
- # Define the class names for text classification
58
- class_names = ['Negative', 'Positive']
59
- multi_class_names = ["Armed Assault", "Bombing or Explosion", "Kidnapping", "Other"] # Updated labels
60
-
61
- # Define the NER labels and colors
62
- ner_labels = {
63
- 'Organisation': 'blue',
64
- 'Person': 'red',
65
- 'Location': 'green',
66
- 'Quantity': 'orange',
67
- 'Weapon': 'purple',
68
- 'Nationality': 'cyan',
69
- 'Temporal': 'magenta',
70
- 'DocumentReference': 'brown',
71
- 'MilitaryPlatform': 'yellow',
72
- 'Money': 'pink'
73
- }
74
-
75
- def handle_error_message(e, default_limit=512):
76
- error_message = str(e)
77
- pattern = re.compile(r"The size of tensor a \((\d+)\) must match the size of tensor b \((\d+)\)")
78
- match = pattern.search(error_message)
79
- if match:
80
- number_1, number_2 = match.groups()
81
- return f"<span style='color: red; font-weight: bold;'>Error: Text Input is over limit where inserted text size {number_1} is larger than model limits of {number_2}</span>"
82
- pattern_qa = re.compile(r"indices\[0,(\d+)\] = \d+ is not in \[0, (\d+)\)")
83
- match_qa = pattern_qa.search(error_message)
84
- if match_qa:
85
- number_1, number_2 = match_qa.groups()
86
- return f"<span style='color: red; font-weight: bold;'>Error: Text Input is over limit where inserted text size {number_1} is larger than model limits of {number_2}</span>"
87
- return f"<span style='color: red; font-weight: bold;'>Error: Text Input is over limit where inserted text size is larger than model limits of {default_limit}</span>"
88
-
89
- # Define the functions for each task
90
- def question_answering(context, question):
91
- try:
92
- inputs = qa_tokenizer(question, context, return_tensors='tf', truncation=True)
93
- outputs = qa_model(inputs)
94
- answer_start = tf.argmax(outputs.start_logits, axis=1).numpy()[0]
95
- answer_end = tf.argmax(outputs.end_logits, axis=1).numpy()[0] + 1
96
- answer = qa_tokenizer.convert_tokens_to_string(qa_tokenizer.convert_ids_to_tokens(inputs['input_ids'].numpy()[0][answer_start:answer_end]))
97
- return f"<span style='color: green; font-weight: bold;'>{answer}</span>"
98
- except Exception as e:
99
- return handle_error_message(e)
100
-
101
- def replace_unk(tokens):
102
- return [token.replace('[UNK]', "'") for token in tokens]
103
-
104
- def named_entity_recognition(text, output_format='html'):
105
- """
106
- Process text for named entity recognition.
107
- output_format: 'html' for GUI display, 'csv' for CSV processing
108
- """
109
- try:
110
- inputs = ner_tokenizer(text, return_tensors='pt', truncation=True)
111
- with torch.no_grad():
112
- outputs = ner_model(**inputs)
113
- ner_results = outputs.logits.argmax(dim=2).squeeze().tolist()
114
- tokens = ner_tokenizer.convert_ids_to_tokens(inputs['input_ids'].squeeze().tolist())
115
- tokens = replace_unk(tokens)
116
-
117
- entities = []
118
- seen_labels = set()
119
- current_entity = []
120
- current_label = None
121
-
122
- # Process tokens and group consecutive entities
123
- for i in range(len(tokens)):
124
- token = tokens[i]
125
- label = ner_model.config.id2label[ner_results[i]].split('-')[-1]
126
-
127
- # Handle subwords
128
- if token.startswith('##'):
129
- if entities:
130
- if output_format == 'html':
131
- entities[-1][0] += token[2:]
132
- elif current_entity:
133
- current_entity[-1] = current_entity[-1] + token[2:]
134
- else:
135
- # For CSV format, group consecutive tokens of same entity type
136
- if output_format == 'csv':
137
- if label != 'O':
138
- if label == current_label:
139
- current_entity.append(token)
140
- else:
141
- if current_entity:
142
- entities.append([' '.join(current_entity), current_label])
143
- current_entity = [token]
144
- current_label = label
145
- else:
146
- if current_entity:
147
- entities.append([' '.join(current_entity), current_label])
148
- current_entity = []
149
- current_label = None
150
- else:
151
- entities.append([token, label])
152
-
153
- if label != 'O':
154
- seen_labels.add(label)
155
-
156
- # Don't forget the last entity for CSV format
157
- if output_format == 'csv' and current_entity:
158
- entities.append([' '.join(current_entity), current_label])
159
-
160
- if output_format == 'csv':
161
- # Group by entity type
162
- grouped_entities = {}
163
- for token, label in entities:
164
- if label != 'O':
165
- if label not in grouped_entities:
166
- grouped_entities[label] = []
167
- grouped_entities[label].append(token)
168
-
169
- # Format the output
170
- result_parts = []
171
- for label, tokens in grouped_entities.items():
172
- unique_tokens = list(dict.fromkeys(tokens)) # Remove duplicates
173
- result_parts.append(f"{label}: {' | '.join(unique_tokens)}")
174
-
175
- return ' || '.join(result_parts)
176
- else:
177
- # Original HTML output
178
- highlighted_text = ""
179
- for token, label in entities:
180
- color = ner_labels.get(label, 'black')
181
- if label != 'O':
182
- highlighted_text += f"<span style='color: {color}; font-weight: bold;'>{token}</span> "
183
- else:
184
- highlighted_text += f"{token} "
185
-
186
- legend = "<div><strong>NER Tags Found:</strong><ul style='list-style-type: disc; padding-left: 20px;'>"
187
- for label in seen_labels:
188
- color = ner_labels.get(label, 'black')
189
- legend += f"<li style='color: {color}; font-weight: bold;'>{label}</li>"
190
- legend += "</ul></div>"
191
-
192
- return f"<div>{highlighted_text}</div>{legend}"
193
-
194
- except Exception as e:
195
- return handle_error_message(e)
196
-
197
- def text_classification(text):
198
- try:
199
- inputs = clf_tokenizer(text, return_tensors='pt', truncation=True, padding=True).to(device)
200
- with torch.no_grad():
201
- outputs = clf_model(**inputs)
202
- logits = outputs.logits.squeeze().tolist()
203
- predicted_class = torch.argmax(outputs.logits, dim=1).item()
204
- confidence = torch.softmax(outputs.logits, dim=1).max().item() * 100
205
-
206
- if predicted_class == 1: # Positive class
207
- result = f"<span style='color: green; font-weight: bold;'>Positive: The text is related to conflict, violence, or politics. (Confidence: {confidence:.2f}%)</span>"
208
- else: # Negative class
209
- result = f"<span style='color: red; font-weight: bold;'>Negative: The text is not related to conflict, violence, or politics. (Confidence: {confidence:.2f}%)</span>"
210
- return result
211
- except Exception as e:
212
- return handle_error_message(e)
213
-
214
- def multilabel_classification(text):
215
- try:
216
- inputs = multi_clf_tokenizer(text, return_tensors='pt', truncation=True, padding=True).to(device)
217
- with torch.no_grad():
218
- outputs = multi_clf_model(**inputs)
219
- predicted_classes = torch.sigmoid(outputs.logits).squeeze().tolist()
220
- if len(predicted_classes) != len(multi_class_names):
221
- return f"Error: Number of predicted classes ({len(predicted_classes)}) does not match number of class names ({len(multi_class_names)})."
222
-
223
- results = []
224
- for i in range(len(predicted_classes)):
225
- confidence = predicted_classes[i] * 100
226
- if predicted_classes[i] >= 0.5:
227
- results.append(f"<span style='color: green; font-weight: bold;'>{multi_class_names[i]} (Confidence: {confidence:.2f}%)</span>")
228
- else:
229
- results.append(f"<span style='color: red; font-weight: bold;'>{multi_class_names[i]} (Confidence: {confidence:.2f}%)</span>")
230
-
231
- return " / ".join(results)
232
- except Exception as e:
233
- return handle_error_message(e)
234
-
235
- def clean_html_tags(text):
236
- """Remove HTML tags and formatting from the output."""
237
- # Remove HTML tags but keep the text content
238
- clean_text = re.sub(r'<[^>]+>', '', text)
239
- # Remove multiple spaces
240
- clean_text = re.sub(r'\s+', ' ', clean_text)
241
- # Remove [CLS] and [SEP] tokens
242
- clean_text = re.sub(r'\[CLS\]|\[SEP\]', '', clean_text)
243
- return clean_text.strip()
244
-
245
- def extract_ner_entities(html_output):
246
- """Extract entities and their types from NER output using a simpler approach."""
247
- # Map colors to entity types
248
- color_to_type = {
249
- 'blue': 'Organisation',
250
- 'red': 'Person',
251
- 'green': 'Location',
252
- 'orange': 'Quantity',
253
- 'purple': 'Weapon',
254
- 'cyan': 'Nationality',
255
- 'magenta': 'Temporal',
256
- 'brown': 'DocumentReference',
257
- 'yellow': 'MilitaryPlatform',
258
- 'pink': 'Money'
259
- }
260
-
261
- # Find all colored spans
262
- pattern = r"<span style='color: ([^']+)[^>]+>([^<]+)</span>"
263
- matches = re.findall(pattern, html_output)
264
-
265
- # Group by entity type
266
- entities = {}
267
-
268
- # Process each match
269
- for color, text in matches:
270
- if color in color_to_type:
271
- entity_type = color_to_type[color]
272
- if entity_type not in entities:
273
- entities[entity_type] = []
274
-
275
- # Clean and store the text
276
- text = text.strip()
277
- if text and not text.isspace():
278
- entities[entity_type].append(text)
279
-
280
- # Join consecutive words for each entity type
281
- result_parts = []
282
- for entity_type, words in entities.items():
283
- # Join consecutive words
284
- phrases = []
285
- current_phrase = []
286
-
287
- for word in words:
288
- if word in [',', '/', ':', '-']: # Skip punctuation
289
- continue
290
- if not current_phrase:
291
- current_phrase.append(word)
292
- else:
293
- # If it's a continuation (e.g., part of a date or name)
294
- if word.startswith(':') or word == 'of' or current_phrase[-1].endswith('/'):
295
- current_phrase.append(word)
296
- else:
297
- # If it's a new entity
298
- phrases.append(' '.join(current_phrase))
299
- current_phrase = [word]
300
-
301
- if current_phrase:
302
- phrases.append(' '.join(current_phrase))
303
-
304
- # Remove duplicates while preserving order
305
- unique_phrases = []
306
- seen = set()
307
- for phrase in phrases:
308
- clean_phrase = phrase.strip()
309
- if clean_phrase and clean_phrase not in seen:
310
- unique_phrases.append(clean_phrase)
311
- seen.add(clean_phrase)
312
-
313
- if unique_phrases:
314
- result_parts.append(f"{entity_type}: {' | '.join(unique_phrases)}")
315
-
316
- return ' || '.join(result_parts)
317
-
318
-
319
- def clean_classification_output(html_output):
320
- """Extract classification results without HTML formatting."""
321
- if "Positive" in html_output:
322
- # Binary classification
323
- match = re.search(r">(Positive|Negative).*?Confidence: ([\d.]+)%", html_output)
324
- if match:
325
- class_name, confidence = match.groups()
326
- return f"{class_name} ({confidence}%)"
327
- else:
328
- # Multilabel classification
329
- results = []
330
- matches = re.finditer(r">([^<]+)\s*\(Confidence:\s*([\d.]+)%\)", html_output)
331
- for match in matches:
332
- class_name, confidence = match.groups()
333
- if float(confidence) >= 50: # Only include classes with confidence >= 50%
334
- results.append(f"{class_name.strip()} ({confidence}%)")
335
- return " | ".join(results) if results else "No classes above 50% confidence"
336
-
337
- return "Unknown"
338
-
339
-
340
- def process_csv_ner(file):
341
- try:
342
- df = pd.read_csv(file.name)
343
-
344
- if 'text' not in df.columns:
345
- return "Error: CSV must contain a 'text' column"
346
-
347
- entities = []
348
- for text in df['text']:
349
- if pd.isna(text):
350
- entities.append("")
351
- continue
352
-
353
- # Use CSV output format
354
- result = named_entity_recognition(str(text), output_format='csv')
355
- entities.append(result)
356
-
357
- df['entities'] = entities
358
-
359
- output_path = "processed_results.csv"
360
- df.to_csv(output_path, index=False)
361
- return output_path
362
- except Exception as e:
363
- return f"Error processing CSV: {str(e)}"
364
-
365
- def process_csv_classification(file, is_multi=False):
366
- try:
367
- df = pd.read_csv(file.name)
368
-
369
- if 'text' not in df.columns:
370
- return "Error: CSV must contain a 'text' column"
371
-
372
- results = []
373
- for text in df['text']:
374
- if pd.isna(text):
375
- results.append("")
376
- continue
377
-
378
- if is_multi:
379
- html_result = multilabel_classification(str(text))
380
- else:
381
- html_result = text_classification(str(text))
382
- results.append(clean_classification_output(html_result))
383
-
384
- result_column = 'multilabel_results' if is_multi else 'classification_results'
385
- df[result_column] = results
386
-
387
- output_path = "processed_results.csv"
388
- df.to_csv(output_path, index=False)
389
- return output_path
390
- except Exception as e:
391
- return f"Error processing CSV: {str(e)}"
392
-
393
-
394
- # Define the Gradio interface
395
- def chatbot(task, text=None, context=None, question=None, file=None):
396
- if file is not None: # Handle CSV file input
397
- if task == "Named Entity Recognition":
398
- return process_csv_ner(file)
399
- elif task == "Text Classification":
400
- return process_csv_classification(file, is_multi=False)
401
- elif task == "Multilabel Classification":
402
- return process_csv_classification(file, is_multi=True)
403
- else:
404
- return "CSV processing is not supported for Question Answering task"
405
-
406
- # Handle regular text input (previous implementation)
407
- if task == "Question Answering":
408
- if context and question:
409
- return question_answering(context, question)
410
- else:
411
- return "Please provide both context and question for the Question Answering task."
412
- elif task == "Named Entity Recognition":
413
- if text:
414
- return named_entity_recognition(text)
415
- else:
416
- return "Please provide text for the Named Entity Recognition task."
417
- elif task == "Text Classification":
418
- if text:
419
- return text_classification(text)
420
- else:
421
- return "Please provide text for the Text Classification task."
422
- elif task == "Multilabel Classification":
423
- if text:
424
- return multilabel_classification(text)
425
- else:
426
- return "Please provide text for the Multilabel Classification task."
427
- else:
428
- return "Please select a valid task."
429
-
430
-
431
-# Custom CSS for modern orange theme
-custom_css = """
-/* CSS Variables for Light and Dark Theme */
-:root {
-    --primary-orange: #ff6b35;
-    --primary-orange-light: #ff8c5a;
-    --primary-orange-dark: #e55a2b;
-    --secondary-orange: #ffa366;
-    --accent-orange: #ff9f40;
-    --background-light: #fefefe;
-    --background-dark: #1a1a1a;
-    --surface-light: #ffffff;
-    --surface-dark: #2d2d2d;
-    --text-primary-light: #2c2c2c;
-    --text-primary-dark: #ffffff;
-    --text-secondary-light: #666666;
-    --text-secondary-dark: #cccccc;
-    --border-light: #e0e0e0;
-    --border-dark: #404040;
-    --shadow-light: rgba(0, 0, 0, 0.1);
-    --shadow-dark: rgba(0, 0, 0, 0.3);
-    --gradient-orange: linear-gradient(135deg, #ff6b35 0%, #ff9f40 100%);
-    --gradient-orange-subtle: linear-gradient(135deg, rgba(255, 107, 53, 0.1) 0%, rgba(255, 159, 64, 0.1) 100%);
-}
-
-/* Dark theme overrides */
-.dark {
-    --background: var(--background-dark);
-    --surface: var(--surface-dark);
-    --text-primary: var(--text-primary-dark);
-    --text-secondary: var(--text-secondary-dark);
-    --border: var(--border-dark);
-    --shadow: var(--shadow-dark);
-}
-
-/* Light theme (default) */
-.light, :root {
-    --background: var(--background-light);
-    --surface: var(--surface-light);
-    --text-primary: var(--text-primary-light);
-    --text-secondary: var(--text-secondary-light);
-    --border: var(--border-light);
-    --shadow: var(--shadow-light);
-}
-
-/* Global Styles */
-* {
-    font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
-}
-
-/* Main Container */
-.gradio-container {
-    background: var(--background) !important;
-    color: var(--text-primary) !important;
-    min-height: 100vh;
-}
-
-/* Header Styling */
-.header-container {
-    background: var(--gradient-orange) !important;
-    padding: 2rem 1rem !important;
-    margin: -1rem -1rem 2rem -1rem !important;
-    border-radius: 0 0 24px 24px !important;
-    box-shadow: 0 8px 32px var(--shadow) !important;
-    position: relative;
-    overflow: hidden;
-}
-
-.header-container::before {
-    content: '';
-    position: absolute;
-    top: 0;
-    left: 0;
-    right: 0;
-    bottom: 0;
-    background: url("data:image/svg+xml,%3Csvg width='60' height='60' viewBox='0 0 60 60' xmlns='http://www.w3.org/2000/svg'%3E%3Cg fill='none' fill-rule='evenodd'%3E%3Cg fill='%23ffffff' fill-opacity='0.05'%3E%3Ccircle cx='30' cy='30' r='2'/%3E%3C/g%3E%3C/g%3E%3C/svg%3E") !important;
-    pointer-events: none;
-}
-
-.header-title-center {
-    text-align: center !important;
-    position: relative;
-    z-index: 1;
-}
-
-.header-title-center a {
-    color: white !important;
-    text-decoration: none !important;
-    font-weight: 900 !important;
-    font-size: 4rem !important;
-    text-shadow: 0 4px 8px rgba(0, 0, 0, 0.2) !important;
-    letter-spacing: -0.02em !important;
-    transition: all 0.3s ease !important;
-}
-
-.header-title-center a:hover {
-    transform: translateY(-2px) !important;
-    text-shadow: 0 6px 16px rgba(0, 0, 0, 0.3) !important;
-}
-
-/* Task Container */
-.task-container {
-    background: var(--surface) !important;
-    border-radius: 16px !important;
-    padding: 2rem !important;
-    box-shadow: 0 4px 24px var(--shadow) !important;
-    border: 1px solid var(--border) !important;
-    margin-bottom: 2rem !important;
-}
-
-/* Input Components */
-.input-text textarea, .input-text input {
-    background: var(--surface) !important;
-    border: 2px solid var(--border) !important;
-    border-radius: 12px !important;
-    padding: 1rem !important;
-    color: var(--text-primary) !important;
-    font-size: 0.95rem !important;
-    line-height: 1.5 !important;
-    transition: all 0.3s ease !important;
-    box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05) !important;
-}
-
-.input-text textarea:focus, .input-text input:focus {
-    border-color: var(--primary-orange) !important;
-    box-shadow: 0 0 0 3px rgba(255, 107, 53, 0.1) !important;
-    outline: none !important;
-    transform: translateY(-1px) !important;
-}
-
-/* Placeholder text styling */
-.input-text textarea::placeholder, .input-text input::placeholder {
-    color: var(--text-secondary) !important;
-    opacity: 0.7 !important;
-}
-
-.input-text textarea::-webkit-input-placeholder, .input-text input::-webkit-input-placeholder {
-    color: var(--text-secondary) !important;
-    opacity: 0.7 !important;
-}
-
-.input-text textarea::-moz-placeholder, .input-text input::-moz-placeholder {
-    color: var(--text-secondary) !important;
-    opacity: 0.7 !important;
-}
-
-.input-text textarea:-ms-input-placeholder, .input-text input:-ms-input-placeholder {
-    color: var(--text-secondary) !important;
-    opacity: 0.7 !important;
-}
-
-/* Dropdown Styling */
-.gr-dropdown {
-    background: var(--surface) !important;
-    border: 2px solid var(--border) !important;
-    border-radius: 12px !important;
-    color: var(--text-primary) !important;
-    transition: all 0.3s ease !important;
-}
-
-.gr-dropdown:focus-within {
-    border-color: var(--primary-orange) !important;
-    box-shadow: 0 0 0 3px rgba(255, 107, 53, 0.1) !important;
-}
-
-/* Button Styling */
-.submit-btn {
-    background: var(--gradient-orange) !important;
-    border: none !important;
-    border-radius: 12px !important;
-    padding: 1rem 2rem !important;
-    color: white !important;
-    font-weight: 600 !important;
-    font-size: 1rem !important;
-    cursor: pointer !important;
-    transition: all 0.3s ease !important;
-    box-shadow: 0 4px 16px rgba(255, 107, 53, 0.3) !important;
-    text-transform: uppercase !important;
-    letter-spacing: 0.5px !important;
-}
-
-.submit-btn:hover {
-    transform: translateY(-2px) !important;
-    box-shadow: 0 6px 24px rgba(255, 107, 53, 0.4) !important;
-    background: linear-gradient(135deg, #ff8c5a 0%, #ffb366 100%) !important;
-}
-
-.submit-btn:active {
-    transform: translateY(0) !important;
-    box-shadow: 0 2px 8px rgba(255, 107, 53, 0.3) !important;
-}
-
-/* File Upload Styling */
-.file-upload {
-    background: var(--gradient-orange-subtle) !important;
-    border: 2px dashed var(--primary-orange) !important;
-    border-radius: 12px !important;
-    padding: 1.5rem !important;
-    text-align: center !important;
-    transition: all 0.3s ease !important;
-}
-
-.file-upload:hover {
-    background: rgba(255, 107, 53, 0.15) !important;
-    border-color: var(--primary-orange-dark) !important;
-}
-
-/* Output Styling */
-.output-html {
-    background: var(--surface) !important;
-    border: 1px solid var(--border) !important;
-    border-radius: 12px !important;
-    padding: 1.5rem !important;
-    margin-top: 1rem !important;
-    box-shadow: 0 2px 12px var(--shadow) !important;
-    min-height: 100px !important;
-}
-
-.output-html div {
-    color: var(--text-primary) !important;
-    line-height: 1.6 !important;
-}
-
-/* Labels */
-label {
-    color: var(--text-primary) !important;
-    font-weight: 600 !important;
-    font-size: 0.9rem !important;
-    margin-bottom: 0.5rem !important;
-    text-transform: uppercase !important;
-    letter-spacing: 0.5px !important;
-}
-
-/* Footer */
-.footer {
-    background: var(--surface) !important;
-    border-top: 1px solid var(--border) !important;
-    padding: 1.5rem !important;
-    margin-top: 2rem !important;
-    text-align: center !important;
-    border-radius: 16px 16px 0 0 !important;
-}
-
-.footer a {
-    color: var(--primary-orange) !important;
-    text-decoration: none !important;
-    font-weight: 500 !important;
-    transition: color 0.3s ease !important;
-}
-
-.footer a:hover {
-    color: var(--primary-orange-dark) !important;
-    text-decoration: underline !important;
-}
-
-/* Responsive Design */
-@media (max-width: 768px) {
-    .header-title-center a {
-        font-size: 2.5rem !important;
-    }
-
-    .task-container {
-        padding: 1.5rem !important;
-        margin: 1rem !important;
-    }
-
-    .header-container {
-        padding: 1.5rem 1rem !important;
-        margin: -1rem -1rem 1rem -1rem !important;
-    }
-}
-
-/* Enhanced NER Output Styling */
-.output-html span[style*="color: blue"] { color: #3b82f6 !important; }
-.output-html span[style*="color: red"] { color: #ef4444 !important; }
-.output-html span[style*="color: green"] { color: #10b981 !important; }
-.output-html span[style*="color: orange"] { color: var(--primary-orange) !important; }
-.output-html span[style*="color: purple"] { color: #8b5cf6 !important; }
-.output-html span[style*="color: cyan"] { color: #06b6d4 !important; }
-.output-html span[style*="color: magenta"] { color: #ec4899 !important; }
-.output-html span[style*="color: brown"] { color: #92400e !important; }
-.output-html span[style*="color: yellow"] { color: #f59e0b !important; }
-.output-html span[style*="color: pink"] { color: #f472b6 !important; }
-
-/* Dark mode specific adjustments */
-@media (prefers-color-scheme: dark) {
-    .gradio-container {
-        background: var(--background-dark) !important;
-        color: var(--text-primary-dark) !important;
-    }
-
-    .task-container, .output-html {
-        background: var(--surface-dark) !important;
-        border-color: var(--border-dark) !important;
-    }
-
-    .input-text textarea, .input-text input, .gr-dropdown {
-        background: var(--surface-dark) !important;
-        border-color: var(--border-dark) !important;
-        color: var(--text-primary-dark) !important;
-    }
-
-    label {
-        color: var(--text-primary-dark) !important;
-    }
-}
-
-/* Smooth transitions for theme switching */
-* {
-    transition: background-color 0.3s ease, border-color 0.3s ease, color 0.3s ease !important;
-}
-"""
-
744
-with gr.Blocks(theme="allenai/gradio-theme", css=custom_css) as demo:
-    with gr.Column():
-        with gr.Row(elem_id="header", elem_classes="header-container"):
-            gr.Markdown("<div class='header-title-center'><a href='https://eventdata.utdallas.edu/conflibert/' style='font-size: 4rem; font-weight: 900;'>ConfliBERT</a></div>")
-
-        with gr.Column(elem_classes="task-container"):
-            gr.Markdown("<h2 style='font-size: 1.25rem; font-weight: 600; margin-bottom: 1.5rem;'>Select a task and provide the necessary inputs:</h2>")
-
-            task = gr.Dropdown(
-                choices=["Question Answering", "Named Entity Recognition", "Text Classification", "Multilabel Classification"],
-                label="Select Task",
-                value="Named Entity Recognition"
-            )
-
-            with gr.Row():
-                text_input = gr.Textbox(
-                    lines=5,
-                    placeholder="Enter the text here...",
-                    label="Text",
-                    elem_classes="input-text"
-                )
-                context_input = gr.Textbox(
-                    lines=5,
-                    placeholder="Enter the context here...",
-                    label="Context",
-                    visible=False,
-                    elem_classes="input-text"
-                )
-                question_input = gr.Textbox(
-                    lines=2,
-                    placeholder="Enter your question here...",
-                    label="Question",
-                    visible=False,
-                    elem_classes="input-text"
-                )
-
-            with gr.Row():
-                file_input = gr.File(
-                    label="Or upload a CSV file (must contain a 'text' column)",
-                    file_types=[".csv"],
-                    elem_classes="file-upload"
-                )
-                file_output = gr.File(
-                    label="Download processed results",
-                    visible=False,
-                    elem_classes="file-download"
-                )
-
-            with gr.Row():
-                submit_button = gr.Button(
-                    "Submit",
-                    elem_id="submit-button",
-                    elem_classes="submit-btn"
-                )
-
-            output = gr.HTML(label="Output", elem_classes="output-html")
-
-        with gr.Row(elem_classes="footer"):
-            gr.Markdown("<a href='https://eventdata.utdallas.edu/'>UTD Event Data</a> | <a href='https://www.utdallas.edu/'>University of Texas at Dallas</a>")
-            gr.Markdown("Developed By: <a href='https://www.linkedin.com/in/sultan-alsarra-phd-56977a63/' target='_blank'>Sultan Alsarra</a> and <a href='http://shreyasmeher.com' target='_blank'>Shreyas Meher</a>")
-
-    def update_inputs(task_name):
-        """Updates the visibility of input components based on the selected task."""
-        if task_name == "Question Answering":
-            return [
-                gr.update(visible=False),
-                gr.update(visible=True),
-                gr.update(visible=True),
-                gr.update(visible=False),
-                gr.update(visible=False)
-            ]
-        else:
-            return [
-                gr.update(visible=True),
-                gr.update(visible=False),
-                gr.update(visible=False),
-                gr.update(visible=True),
-                gr.update(visible=True)
-            ]
-
-    def chatbot_interface(task, text, context, question, file):
-        """Handles both file and text inputs for different tasks."""
-        if file:
-            result = chatbot(task, file=file)
-            if isinstance(result, str) and result.endswith('.csv'):
-                return gr.update(visible=False), gr.update(value=result, visible=True)
-            return gr.update(value=result, visible=True), gr.update(visible=False)
-        else:
-            result = chatbot(task, text, context, question)
-            return gr.update(value=result, visible=True), gr.update(visible=False)
-
-    def chatbot(task, text=None, context=None, question=None, file=None):
-        """Main function to process different types of inputs and tasks."""
-        if file is not None:  # Handle CSV file input
-            if task == "Named Entity Recognition":
-                return process_csv_ner(file)
-            elif task == "Text Classification":
-                return process_csv_classification(file, is_multi=False)
-            elif task == "Multilabel Classification":
-                return process_csv_classification(file, is_multi=True)
-            else:
-                return "CSV processing is not supported for Question Answering task"
-
-        # Handle regular text input
-        if task == "Question Answering":
-            if context and question:
-                return question_answering(context, question)
-            else:
-                return "Please provide both context and question for the Question Answering task."
-        elif task == "Named Entity Recognition":
-            if text:
-                return named_entity_recognition(text)
-            else:
-                return "Please provide text for the Named Entity Recognition task."
-        elif task == "Text Classification":
-            if text:
-                return text_classification(text)
-            else:
-                return "Please provide text for the Text Classification task."
-        elif task == "Multilabel Classification":
-            if text:
-                return multilabel_classification(text)
-            else:
-                return "Please provide text for the Multilabel Classification task."
-        else:
-            return "Please select a valid task."
-
-    task.change(fn=update_inputs, inputs=task, outputs=[text_input, context_input, question_input, file_input, file_output])
-    submit_button.click(
-        fn=chatbot_interface,
-        inputs=[task, text_input, context_input, question_input, file_input],
-        outputs=[output, file_output]
-    )
-
-demo.launch(share=True)
+# ============================================================================
+# ConfliBERT - Conflict & Political Violence NLP Toolkit
+# University of Texas at Dallas | Event Data Lab
+# ============================================================================
+
+import os
+os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
+
+import torch
+import tensorflow as tf
+import tf_keras  # noqa: F401 - needed for TF model loading
+import keras  # noqa: F401 - needed for TF model loading
+from transformers import (
+    AutoTokenizer,
+    AutoModelForSequenceClassification,
+    AutoModelForTokenClassification,
+    TFAutoModelForQuestionAnswering,
+    TrainingArguments,
+    Trainer,
+    EarlyStoppingCallback,
+    TrainerCallback,
+)
+import gradio as gr
+import numpy as np
+import pandas as pd
+import re
+import csv
+import tempfile
+from sklearn.metrics import (
+    accuracy_score as sk_accuracy,
+    precision_score as sk_precision,
+    recall_score as sk_recall,
+    f1_score as sk_f1,
+    roc_curve,
+    auc as sk_auc,
+)
+from sklearn.preprocessing import label_binarize
+from torch.utils.data import Dataset as TorchDataset
+import gc
+
+
+# ============================================================================
+# CONFIGURATION
+# ============================================================================
+
+if torch.cuda.is_available():
+    device = torch.device('cuda')
+elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
+    device = torch.device('mps')
+else:
+    device = torch.device('cpu')
+
+MAX_TOKEN_LENGTH = 512
+
+
56
+def get_system_info():
+    """Build an HTML string describing the user's compute environment."""
+    import platform
+    lines = []
+
+    # Device
+    if device.type == 'cuda':
+        gpu_name = torch.cuda.get_device_name(0)
+        vram = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
+        lines.append(f"GPU: {gpu_name} ({vram:.1f} GB VRAM)")
+        lines.append("FP16 training: supported")
+    elif device.type == 'mps':
+        lines.append("GPU: Apple Silicon (MPS)")
+        lines.append("FP16 training: not supported on MPS")
+    else:
+        lines.append("GPU: None detected (using CPU)")
+        lines.append("FP16 training: not supported on CPU")
+
+    # CPU / RAM (os is already imported at module level)
+    cpu_count = os.cpu_count() or 1
+    lines.append(f"CPU cores: {cpu_count}")
+    try:
+        import psutil
+        ram_gb = psutil.virtual_memory().total / (1024 ** 3)
+        lines.append(f"RAM: {ram_gb:.1f} GB")
+    except ImportError:
+        pass
+
+    lines.append(f"Platform: {platform.system()} {platform.machine()}")
+    lines.append(f"PyTorch: {torch.__version__}")
+
+    return " · ".join(lines)
+
90
+FINETUNE_MODELS = {
+    "ConfliBERT (recommended for conflict/political text)": "snowood1/ConfliBERT-scr-uncased",
+    "BERT Base Uncased": "bert-base-uncased",
+    "BERT Base Cased": "bert-base-cased",
+    "RoBERTa Base": "roberta-base",
+    "ModernBERT Base": "answerdotai/ModernBERT-base",
+    "DeBERTa v3 Base": "microsoft/deberta-v3-base",
+    "DistilBERT Base Uncased": "distilbert-base-uncased",
+}
+
+NER_LABELS = {
+    'Organisation': '#3b82f6',
+    'Person': '#ef4444',
+    'Location': '#10b981',
+    'Quantity': '#ff6b35',
+    'Weapon': '#8b5cf6',
+    'Nationality': '#06b6d4',
+    'Temporal': '#ec4899',
+    'DocumentReference': '#92400e',
+    'MilitaryPlatform': '#f59e0b',
+    'Money': '#f472b6',
+}
+
+CLASS_NAMES = ['Negative', 'Positive']
+MULTI_CLASS_NAMES = ["Armed Assault", "Bombing or Explosion", "Kidnapping", "Other"]
+
+
+# ============================================================================
+# PRETRAINED MODEL LOADING
+# ============================================================================
+
+qa_model_name = 'salsarra/ConfliBERT-QA'
+qa_model = TFAutoModelForQuestionAnswering.from_pretrained(qa_model_name)
+qa_tokenizer = AutoTokenizer.from_pretrained(qa_model_name)
+
+ner_model_name = 'eventdata-utd/conflibert-named-entity-recognition'
+ner_model = AutoModelForTokenClassification.from_pretrained(ner_model_name).to(device)
+ner_tokenizer = AutoTokenizer.from_pretrained(ner_model_name)
+
+clf_model_name = 'eventdata-utd/conflibert-binary-classification'
+clf_model = AutoModelForSequenceClassification.from_pretrained(clf_model_name).to(device)
+clf_tokenizer = AutoTokenizer.from_pretrained(clf_model_name)
+
+multi_clf_model_name = 'eventdata-utd/conflibert-satp-relevant-multilabel'
+multi_clf_model = AutoModelForSequenceClassification.from_pretrained(multi_clf_model_name).to(device)
+multi_clf_tokenizer = AutoTokenizer.from_pretrained(multi_clf_model_name)
+
+
138
+# ============================================================================
+# UTILITY FUNCTIONS
+# ============================================================================
+
+def get_path(f):
+    """Get file path from Gradio file component output."""
+    if f is None:
+        return None
+    return f if isinstance(f, str) else getattr(f, 'name', str(f))
+
+
+def truncate_text(text, tokenizer, max_length=MAX_TOKEN_LENGTH):
+    tokens = tokenizer.encode(text, truncation=False)
+    if len(tokens) > max_length:
+        tokens = tokens[:max_length - 1] + [tokenizer.sep_token_id]
+        return tokenizer.decode(tokens, skip_special_tokens=True)
+    return text
+
+
+def info_callout(text):
+    """Wrap markdown text in a styled callout div to avoid Gradio double-border."""
+    return (
+        "<div class='info-callout-inner' style='"
+        "background: #fff7f3; border-left: 3px solid #ff6b35; "
+        "padding: 0.75rem 1rem; border-radius: 0 8px 8px 0; "
+        "font-size: 0.9rem;'>\n\n"
+        f"{text}\n\n</div>"
+    )
+
+
+def handle_error(e, default_limit=512):
+    msg = str(e)
+    match = re.search(
+        r"The size of tensor a \((\d+)\) must match the size of tensor b \((\d+)\)", msg
+    )
+    if match:
+        return (
+            f"<span style='color: #ef4444; font-weight: 600;'>"
+            f"Error: Input ({match.group(1)} tokens) exceeds model limit ({match.group(2)})</span>"
+        )
+    match_qa = re.search(r"indices\[0,(\d+)\] = \d+ is not in \[0, (\d+)\)", msg)
+    if match_qa:
+        return (
+            f"<span style='color: #ef4444; font-weight: 600;'>"
+            f"Error: Input too long for model (limit: {match_qa.group(2)} tokens)</span>"
+        )
+    return f"<span style='color: #ef4444; font-weight: 600;'>Error: {msg}</span>"
+
+
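The first pattern in `handle_error` matches PyTorch's tensor size-mismatch message and pulls out the two sizes. A minimal standalone sketch of that extraction (the error string below is a made-up example of the kind of message being parsed):

```python
import re

# Hypothetical PyTorch error text of the kind handle_error() parses.
msg = "The size of tensor a (687) must match the size of tensor b (512)"

pattern = r"The size of tensor a \((\d+)\) must match the size of tensor b \((\d+)\)"
match = re.search(pattern, msg)

# group(1) is the offending input length, group(2) the model's limit.
length, limit = match.group(1), match.group(2)
```

Parsing exception text is brittle across library versions, which is why the function falls through to a generic error string when neither pattern matches.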
187
+# ============================================================================
+# INFERENCE FUNCTIONS
+# ============================================================================
+
+def question_answering(context, question):
+    if not context or not question:
+        return "Please provide both context and question."
+    try:
+        inputs = qa_tokenizer(question, context, return_tensors='tf', truncation=True)
+        outputs = qa_model(inputs)
+        start = tf.argmax(outputs.start_logits, axis=1).numpy()[0]
+        end = tf.argmax(outputs.end_logits, axis=1).numpy()[0] + 1
+        tokens = qa_tokenizer.convert_ids_to_tokens(
+            inputs['input_ids'].numpy()[0][start:end]
+        )
+        answer = qa_tokenizer.convert_tokens_to_string(tokens)
+        return f"<span style='color: #10b981; font-weight: 600;'>{answer}</span>"
+    except Exception as e:
+        return handle_error(e)
+
+
208
+def named_entity_recognition(text, output_format='html'):
+    if not text:
+        return "Please provide text for analysis."
+    try:
+        # Move inputs to the same device as the model (which was .to(device)'d at load).
+        inputs = ner_tokenizer(text, return_tensors='pt', truncation=True).to(device)
+        with torch.no_grad():
+            outputs = ner_model(**inputs)
+        results = outputs.logits.argmax(dim=2).squeeze().tolist()
+        tokens = ner_tokenizer.convert_ids_to_tokens(inputs['input_ids'].squeeze().tolist())
+        tokens = [t.replace('[UNK]', "'") for t in tokens]
+
+        entities = []
+        seen_labels = set()
+        current_entity = []
+        current_label = None
+
+        for i in range(len(tokens)):
+            token = tokens[i]
+            label = ner_model.config.id2label[results[i]].split('-')[-1]
+
+            if token.startswith('##'):
+                # WordPiece continuation: glue onto the previous token.
+                if entities:
+                    if output_format == 'html':
+                        entities[-1][0] += token[2:]
+                    elif current_entity:
+                        current_entity[-1] = current_entity[-1] + token[2:]
+            else:
+                if output_format == 'csv':
+                    if label != 'O':
+                        if label == current_label:
+                            current_entity.append(token)
+                        else:
+                            if current_entity:
+                                entities.append([' '.join(current_entity), current_label])
+                            current_entity = [token]
+                            current_label = label
+                    else:
+                        if current_entity:
+                            entities.append([' '.join(current_entity), current_label])
+                        current_entity = []
+                        current_label = None
+                else:
+                    entities.append([token, label])
+
+            if label != 'O':
+                seen_labels.add(label)
+
+        if output_format == 'csv' and current_entity:
+            entities.append([' '.join(current_entity), current_label])
+
+        if output_format == 'csv':
+            grouped = {}
+            for token, label in entities:
+                if label != 'O':
+                    grouped.setdefault(label, []).append(token)
+            parts = []
+            for label, toks in grouped.items():
+                unique = list(dict.fromkeys(toks))
+                parts.append(f"{label}: {' | '.join(unique)}")
+            return ' || '.join(parts)
+
+        # HTML output
+        highlighted = ""
+        for token, label in entities:
+            color = NER_LABELS.get(label, 'inherit')
+            if label != 'O':
+                highlighted += (
+                    f"<span style='color: {color}; font-weight: 600;'>{token}</span> "
+                )
+            else:
+                highlighted += f"{token} "
+
+        if seen_labels:
+            legend_items = ""
+            for label in sorted(seen_labels):
+                color = NER_LABELS.get(label, '#666')
+                legend_items += (
+                    f"<li style='color: {color}; font-weight: 600; "
+                    f"background: {color}15; padding: 2px 8px; border-radius: 4px; "
+                    f"font-size: 0.85rem;'>{label}</li>"
+                )
+            legend = (
+                f"<div style='margin-top: 1rem; padding-top: 0.75rem; "
+                f"border-top: 1px solid #e5e7eb;'>"
+                f"<strong>Entities found:</strong>"
+                f"<ul style='list-style: none; padding: 0; display: flex; "
+                f"flex-wrap: wrap; gap: 0.5rem; margin-top: 0.5rem;'>"
+                f"{legend_items}</ul></div>"
+            )
+            return f"<div style='line-height: 1.8;'>{highlighted}</div>{legend}"
+        else:
+            return (
+                f"<div style='line-height: 1.8;'>{highlighted}</div>"
+                f"<div style='color: #888; margin-top: 0.5rem;'>No entities detected.</div>"
+            )
+
+    except Exception as e:
+        return handle_error(e)
+
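The `output_format='csv'` branch flattens entity/label pairs into a `Label: a | b || Label2: c` string, grouping by label and de-duplicating while preserving first-seen order. That grouping step can be sketched on its own (the entity pairs below are made-up sample data, not model output):

```python
# Hypothetical (token, label) pairs as produced by the CSV branch above.
entities = [
    ("John Smith", "Person"),
    ("Kabul", "Location"),
    ("John Smith", "Person"),  # duplicate mention
]

grouped = {}
for token, label in entities:
    if label != 'O':
        grouped.setdefault(label, []).append(token)

parts = []
for label, toks in grouped.items():
    unique = list(dict.fromkeys(toks))  # order-preserving de-duplication
    parts.append(f"{label}: {' | '.join(unique)}")

result = ' || '.join(parts)
```

`dict.fromkeys` relies on Python 3.7+ dicts preserving insertion order, so repeated mentions collapse without reordering the entities.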
307
+
+def predict_with_model(text, model, tokenizer):
+    """Run inference with an arbitrary classification model."""
+    model.eval()
+    dev = next(model.parameters()).device
+    inputs = tokenizer(
+        text, return_tensors='pt', truncation=True, padding=True, max_length=512
+    )
+    inputs = {k: v.to(dev) for k, v in inputs.items()}
+
+    with torch.no_grad():
+        outputs = model(**inputs)
+
+    probs = torch.softmax(outputs.logits, dim=1).squeeze()
+    predicted = torch.argmax(probs).item()
+    num_classes = probs.shape[0] if probs.dim() > 0 else 1
+
+    lines = []
+    for i in range(num_classes):
+        p = probs[i].item() * 100 if probs.dim() > 0 else probs.item() * 100
+        if i == predicted:
+            lines.append(
+                f"<span style='color: #10b981; font-weight: 600;'>"
+                f"Class {i}: {p:.2f}% (predicted)</span>"
+            )
+        else:
+            lines.append(f"<span style='color: #9ca3af;'>Class {i}: {p:.2f}%</span>")
+    return "<br>".join(lines)
+
+
337
+def text_classification(text, custom_model=None, custom_tokenizer=None):
+    if not text:
+        return "Please provide text for classification."
+    try:
+        # Use custom model if loaded
+        if custom_model is not None and custom_tokenizer is not None:
+            return predict_with_model(text, custom_model, custom_tokenizer)
+
+        # Pretrained binary classifier
+        inputs = clf_tokenizer(
+            text, return_tensors='pt', truncation=True, padding=True
+        ).to(device)
+        with torch.no_grad():
+            outputs = clf_model(**inputs)
+        predicted = torch.argmax(outputs.logits, dim=1).item()
+        confidence = torch.softmax(outputs.logits, dim=1).max().item() * 100
+
+        if predicted == 1:
+            return (
+                f"<span style='color: #10b981; font-weight: 600;'>"
+                f"Positive -- Related to conflict, violence, or politics. "
+                f"(Confidence: {confidence:.1f}%)</span>"
+            )
+        else:
+            return (
+                f"<span style='color: #ef4444; font-weight: 600;'>"
+                f"Negative -- Not related to conflict, violence, or politics. "
+                f"(Confidence: {confidence:.1f}%)</span>"
+            )
+    except Exception as e:
+        return handle_error(e)
+
+
370
+def multilabel_classification(text):
+    if not text:
+        return "Please provide text for classification."
+    try:
+        inputs = multi_clf_tokenizer(
+            text, return_tensors='pt', truncation=True, padding=True
+        ).to(device)
+        with torch.no_grad():
+            outputs = multi_clf_model(**inputs)
+        probs = torch.sigmoid(outputs.logits).squeeze().tolist()
+
+        results = []
+        for i in range(len(probs)):
+            conf = probs[i] * 100
+            if probs[i] >= 0.5:
+                results.append(
+                    f"<span style='color: #10b981; font-weight: 600;'>"
+                    f"{MULTI_CLASS_NAMES[i]}: {conf:.1f}%</span>"
+                )
+            else:
+                results.append(
+                    f"<span style='color: #9ca3af;'>"
+                    f"{MULTI_CLASS_NAMES[i]}: {conf:.1f}%</span>"
+                )
+        return "<br>".join(results)
+    except Exception as e:
+        return handle_error(e)
+
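Note that `multilabel_classification` applies a sigmoid per label with a 0.5 threshold rather than a softmax over all labels, so several categories can be active at once. A minimal numeric sketch of that decision rule (the logits are made-up values, not real model output):

```python
import math

MULTI_CLASS_NAMES = ["Armed Assault", "Bombing or Explosion", "Kidnapping", "Other"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw logits for one input sentence.
logits = [2.1, -0.3, -4.0, 0.8]
probs = [sigmoid(z) for z in logits]

# Unlike softmax, each label is judged independently, so more than one can pass 0.5.
active = [name for name, p in zip(MULTI_CLASS_NAMES, probs) if p >= 0.5]
```

With these logits, `active` contains both "Armed Assault" and "Other", which a softmax-plus-argmax formulation could never produce.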
398
+
+ # ============================================================================
+ # CSV BATCH PROCESSING
+ # ============================================================================
+
+ def process_csv_ner(file):
+     path = get_path(file)
+     if path is None:
+         return None
+     df = pd.read_csv(path)
+     if 'text' not in df.columns:
+         raise ValueError("CSV must contain a 'text' column")
+
+     entities = []
+     for text in df['text']:
+         if pd.isna(text):
+             entities.append("")
+         else:
+             entities.append(named_entity_recognition(str(text), output_format='csv'))
+     df['entities'] = entities
+
+     out = tempfile.NamedTemporaryFile(suffix='_ner_results.csv', delete=False)
+     df.to_csv(out.name, index=False)
+     return out.name
+
+
+ def process_csv_binary(file, custom_model=None, custom_tokenizer=None):
+     path = get_path(file)
+     if path is None:
+         return None
+     df = pd.read_csv(path)
+     if 'text' not in df.columns:
+         raise ValueError("CSV must contain a 'text' column")
+
+     results = []
+     for text in df['text']:
+         if pd.isna(text):
+             results.append("")
+         else:
+             html = text_classification(str(text), custom_model, custom_tokenizer)
+             results.append(re.sub(r'<[^>]+>', '', html).strip())
+     df['classification_results'] = results
+
+     out = tempfile.NamedTemporaryFile(suffix='_classification_results.csv', delete=False)
+     df.to_csv(out.name, index=False)
+     return out.name
+
+
+ def process_csv_multilabel(file):
+     path = get_path(file)
+     if path is None:
+         return None
+     df = pd.read_csv(path)
+     if 'text' not in df.columns:
+         raise ValueError("CSV must contain a 'text' column")
+
+     results = []
+     for text in df['text']:
+         if pd.isna(text):
+             results.append("")
+         else:
+             html = multilabel_classification(str(text))
+             results.append(re.sub(r'<[^>]+>', '', html).strip())
+     df['multilabel_results'] = results
+
+     out = tempfile.NamedTemporaryFile(suffix='_multilabel_results.csv', delete=False)
+     df.to_csv(out.name, index=False)
+     return out.name
+
+
+ # ============================================================================
+ # FINETUNING
+ # ============================================================================
+
+ class TextClassificationDataset(TorchDataset):
+     """PyTorch Dataset for text classification with HuggingFace tokenizers."""
+
+     def __init__(self, texts, labels, tokenizer, max_length=512):
+         self.encodings = tokenizer(
+             texts, truncation=True, padding=True,
+             max_length=max_length, return_tensors=None,
+         )
+         self.labels = labels
+
+     def __getitem__(self, idx):
+         item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
+         item['labels'] = torch.tensor(self.labels[idx], dtype=torch.long)
+         return item
+
+     def __len__(self):
+         return len(self.labels)
+
+
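`TextClassificationDataset` tokenizes everything eagerly in `__init__` and serves one field dict per index; the HF `Trainer` relies only on the `__len__`/`__getitem__` contract. A toy sketch of that contract with a stand-in "tokenizer" (no torch, illustrative only):

```python
class ToyDataset:
    """Same contract as TextClassificationDataset: eager encoding in
    __init__, a dict of fields per item, length taken from the labels."""

    def __init__(self, texts, labels):
        # Stand-in for tokenizer(): one "input_ids" list per text
        self.encodings = {"input_ids": [[len(t)] for t in texts]}
        self.labels = labels

    def __getitem__(self, idx):
        item = {k: v[idx] for k, v in self.encodings.items()}
        item["labels"] = self.labels[idx]
        return item

    def __len__(self):
        return len(self.labels)

ds = ToyDataset(["a protest", "calm day"], [1, 0])
```

The real class additionally wraps each field in `torch.tensor`, which is what the Trainer's default collator expects.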
+ def parse_data_file(file_path):
+     """Parse a TSV/CSV data file. Expected format: text<separator>label (no header).
+     Labels must be integers. Returns (texts, labels, num_labels)."""
+     path = get_path(file_path)
+     texts, labels = [], []
+
+     # Detect delimiter from first line
+     with open(path, 'r', encoding='utf-8') as f:
+         first_line = f.readline()
+     delimiter = '\t' if '\t' in first_line else ','
+
+     with open(path, 'r', encoding='utf-8') as f:
+         reader = csv.reader(f, delimiter=delimiter, quotechar='"')
+         for row in reader:
+             if len(row) < 2:
+                 continue
+             try:
+                 label = int(row[-1].strip())
+                 text = row[0].strip() if len(row) == 2 else delimiter.join(row[:-1]).strip()
+                 if text:
+                     texts.append(text)
+                     labels.append(label)
+             except (ValueError, IndexError):
+                 continue  # skip header or malformed rows
+
+     if not texts:
+         raise ValueError(
+             "No valid data rows found. Expected format: text<tab>label (no header row)"
+         )
+
+     num_labels = max(labels) + 1
+     return texts, labels, num_labels
+
+
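`parse_data_file` sniffs the delimiter from the first line and silently skips any header row, because a header's label column fails `int()`. A self-contained sketch of the same logic over an in-memory string (`parse_rows` is an illustrative helper, not part of the app):

```python
import csv
import io

def parse_rows(raw):
    """Sniff tab vs comma from the first line, then keep only rows whose
    last field parses as an integer label."""
    delimiter = '\t' if '\t' in raw.splitlines()[0] else ','
    texts, labels = [], []
    for row in csv.reader(io.StringIO(raw), delimiter=delimiter, quotechar='"'):
        if len(row) < 2:
            continue
        try:
            labels.append(int(row[-1].strip()))
        except ValueError:
            continue  # header or malformed row
        texts.append(delimiter.join(row[:-1]).strip())
    return texts, labels, max(labels) + 1

# The header row is dropped because "label" is not an integer
raw = "text\tlabel\nprotest reported\t1\nweather update\t0\n"
texts, labels, n = parse_rows(raw)
```

Note that, as in the real function, `num_labels = max(labels) + 1` assumes labels are 0-indexed and contiguous.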
+ class LogCallback(TrainerCallback):
+     """Captures training logs for display in the UI."""
+
+     def __init__(self):
+         self.entries = []
+
+     def on_log(self, args, state, control, logs=None, **kwargs):
+         if logs:
+             self.entries.append({**logs})
+
+     def format(self):
+         lines = []
+         skip_keys = {
+             'total_flos', 'train_runtime', 'train_samples_per_second',
+             'train_steps_per_second', 'train_loss',
+         }
+         for entry in self.entries:
+             parts = []
+             for k, v in sorted(entry.items()):
+                 if k in skip_keys:
+                     continue
+                 if isinstance(v, float):
+                     parts.append(f"{k}: {v:.4f}")
+                 elif isinstance(v, (int, np.integer)):
+                     parts.append(f"{k}: {v}")
+             if parts:
+                 lines.append(" ".join(parts))
+         return "\n".join(lines)
+
+
+ def make_compute_metrics(task_type):
+     """Factory for compute_metrics function based on task type."""
+
+     def compute_metrics(eval_pred):
+         logits, labels = eval_pred
+         preds = np.argmax(logits, axis=-1)
+         acc = sk_accuracy(labels, preds)
+
+         if task_type == "Binary":
+             return {
+                 'accuracy': acc,
+                 'precision': sk_precision(labels, preds, zero_division=0),
+                 'recall': sk_recall(labels, preds, zero_division=0),
+                 'f1': sk_f1(labels, preds, zero_division=0),
+             }
+         else:
+             return {
+                 'accuracy': acc,
+                 'f1_macro': sk_f1(labels, preds, average='macro', zero_division=0),
+                 'f1_micro': sk_f1(labels, preds, average='micro', zero_division=0),
+                 'precision_macro': sk_precision(
+                     labels, preds, average='macro', zero_division=0
+                 ),
+                 'precision_micro': sk_precision(
+                     labels, preds, average='micro', zero_division=0
+                 ),
+                 'recall_macro': sk_recall(
+                     labels, preds, average='macro', zero_division=0
+                 ),
+                 'recall_micro': sk_recall(
+                     labels, preds, average='micro', zero_division=0
+                 ),
+             }
+
+     return compute_metrics
+
+
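`make_compute_metrics` reports both macro and micro averages for multiclass runs; macro weights every class equally, so a rare class that is always missed drags the score down. A pure-Python illustration with hypothetical labels (not the app's sklearn aliases):

```python
def f1_per_class(y_true, y_pred, cls):
    """One-vs-rest F1 for a single class label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

# Hypothetical labels: class 0 is perfect, class 1 is always missed
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 2, 2]
macro = sum(f1_per_class(y_true, y_pred, c) for c in {0, 1, 2}) / 3
# Class 0: F1 = 1.0; class 1: F1 = 0.0; class 2: F1 = 2/3 -> macro = 5/9
```

Micro averaging would instead pool all true/false positives across classes, which for single-label multiclass reduces to plain accuracy.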
+ def run_finetuning(
+     train_file, dev_file, test_file, task_type, model_display_name,
+     epochs, batch_size, lr, weight_decay, warmup_ratio, max_seq_len,
+     grad_accum, fp16, patience, scheduler,
+     progress=gr.Progress(track_tqdm=True),
+ ):
+     """Main finetuning function. Returns logs, metrics, model state, and visibility updates."""
+     try:
+         # Validate inputs
+         if train_file is None or dev_file is None or test_file is None:
+             raise ValueError("Please upload all three data files (train, dev, test).")
+
+         epochs = int(epochs)
+         batch_size = int(batch_size)
+         max_seq_len = int(max_seq_len)
+         grad_accum = int(grad_accum)
+         patience = int(patience)
+
+         # Parse data files
+         train_texts, train_labels, n_train = parse_data_file(train_file)
+         dev_texts, dev_labels, n_dev = parse_data_file(dev_file)
+         test_texts, test_labels, n_test = parse_data_file(test_file)
+
+         num_labels = max(n_train, n_dev, n_test)
+         if task_type == "Binary" and num_labels > 2:
+             raise ValueError(
+                 f"Binary task selected but found {num_labels} label classes in data. "
+                 f"Use Multiclass instead."
+             )
+         if task_type == "Binary":
+             num_labels = 2
+
+         # Load model and tokenizer
+         model_id = FINETUNE_MODELS[model_display_name]
+         tokenizer = AutoTokenizer.from_pretrained(model_id)
+         model = AutoModelForSequenceClassification.from_pretrained(
+             model_id, num_labels=num_labels
+         )
+
+         # Create datasets
+         train_ds = TextClassificationDataset(
+             train_texts, train_labels, tokenizer, max_seq_len
+         )
+         dev_ds = TextClassificationDataset(
+             dev_texts, dev_labels, tokenizer, max_seq_len
+         )
+         test_ds = TextClassificationDataset(
+             test_texts, test_labels, tokenizer, max_seq_len
+         )
+
+         # Output directory
+         output_dir = tempfile.mkdtemp(prefix='conflibert_ft_')
+
+         # Training arguments
+         best_metric = 'f1' if task_type == 'Binary' else 'f1_macro'
+         training_args = TrainingArguments(
+             output_dir=output_dir,
+             num_train_epochs=epochs,
+             per_device_train_batch_size=batch_size,
+             per_device_eval_batch_size=batch_size * 2,
+             learning_rate=lr,
+             weight_decay=weight_decay,
+             warmup_ratio=warmup_ratio,
+             gradient_accumulation_steps=grad_accum,
+             fp16=fp16 and torch.cuda.is_available(),
+             eval_strategy='epoch',
+             save_strategy='epoch',
+             load_best_model_at_end=True,
+             metric_for_best_model=best_metric,
+             greater_is_better=True,
+             logging_steps=10,
+             save_total_limit=2,
+             lr_scheduler_type=scheduler,
+             report_to='none',
+             seed=42,
+         )
+
+         # Callbacks
+         log_callback = LogCallback()
+         callbacks = [log_callback]
+         if patience > 0:
+             callbacks.append(EarlyStoppingCallback(early_stopping_patience=patience))
+
+         # Create Trainer
+         trainer = Trainer(
+             model=model,
+             args=training_args,
+             train_dataset=train_ds,
+             eval_dataset=dev_ds,
+             compute_metrics=make_compute_metrics(task_type),
+             callbacks=callbacks,
+         )
+
+         # Train
+         train_result = trainer.train()
+
+         # Evaluate on test set
+         test_results = trainer.evaluate(test_ds, metric_key_prefix='test')
+
+         # Build log text
+         header = (
+             f"=== Configuration ===\n"
+             f"Model: {model_display_name}\n"
+             f" {model_id}\n"
+             f"Task: {task_type} Classification ({num_labels} classes)\n"
+             f"Data: {len(train_texts)} train / {len(dev_texts)} dev / {len(test_texts)} test\n"
+             f"Epochs: {epochs} Batch: {batch_size} LR: {lr} Scheduler: {scheduler}\n"
+             f"\n=== Training Log ===\n"
+         )
+         runtime = train_result.metrics.get('train_runtime', 0)
+         footer = (
+             f"\n=== Training Complete ===\n"
+             f"Time: {runtime:.1f}s ({runtime / 60:.1f} min)\n"
+         )
+         log_text = header + log_callback.format() + footer
+
+         # Build metrics DataFrame
+         metrics_data = []
+         for k, v in sorted(test_results.items()):
+             if isinstance(v, (int, float, np.floating, np.integer)) and k != 'test_epoch':
+                 name = k.replace('test_', '').replace('_', ' ').title()
+                 metrics_data.append([name, f"{float(v):.4f}"])
+         metrics_df = pd.DataFrame(metrics_data, columns=['Metric', 'Score'])
+
+         # Move trained model to CPU for inference
+         trained_model = trainer.model.cpu()
+         trained_model.eval()
+
+         return (
+             log_text, metrics_df, trained_model, tokenizer, num_labels,
+             gr.Column(visible=True), gr.Column(visible=True),
+         )
+
+     except Exception as e:
+         error_log = f"Training failed:\n{str(e)}"
+         empty_df = pd.DataFrame(columns=['Metric', 'Score'])
+         return (
+             error_log, empty_df, None, None, None,
+             gr.Column(visible=False), gr.Column(visible=False),
+         )
+
+
+ # ============================================================================
+ # MODEL MANAGEMENT (predict, save, load)
+ # ============================================================================
+
+ def predict_finetuned(text, model_state, tokenizer_state, num_labels_state):
+     """Run prediction with the finetuned model stored in gr.State."""
+     if not text:
+         return "Please enter some text."
+     if model_state is None:
+         return "No model available. Please train a model first."
+     return predict_with_model(text, model_state, tokenizer_state)
+
+
+ def save_finetuned_model(save_path, model_state, tokenizer_state):
+     """Save the finetuned model and tokenizer to disk."""
+     if model_state is None:
+         return "No model to save. Please train a model first."
+     if not save_path:
+         return "Please specify a save directory."
+     try:
+         os.makedirs(save_path, exist_ok=True)
+         model_state.save_pretrained(save_path)
+         tokenizer_state.save_pretrained(save_path)
+         return f"Model saved successfully to: {save_path}"
+     except Exception as e:
+         return f"Error saving model: {str(e)}"
+
+
+ def load_custom_model(path):
+     """Load a finetuned classification model from disk."""
+     if not path or not os.path.isdir(path):
+         return None, None, "Invalid path. Please enter a valid model directory."
+     try:
+         tokenizer = AutoTokenizer.from_pretrained(path)
+         model = AutoModelForSequenceClassification.from_pretrained(path)
+         model.eval()
+         n = model.config.num_labels
+         return model, tokenizer, f"Loaded model with {n} classes from: {path}"
+     except Exception as e:
+         return None, None, f"Error loading model: {str(e)}"
+
+
+ def reset_custom_model():
+     """Reset to the pretrained ConfliBERT binary classifier."""
+     return None, None, "Reset to pretrained ConfliBERT binary classifier."
+
+
+ def batch_predict_finetuned(file, model_state, tokenizer_state, num_labels_state):
+     """Run batch predictions on a CSV using the finetuned model."""
+     if model_state is None:
+         return None
+     path = get_path(file)
+     if path is None:
+         return None
+
+     df = pd.read_csv(path)
+     if 'text' not in df.columns:
+         raise ValueError("CSV must contain a 'text' column")
+
+     model_state.eval()
+     dev = next(model_state.parameters()).device
+
+     predictions, confidences = [], []
+     for text in df['text']:
+         if pd.isna(text):
+             predictions.append("")
+             confidences.append("")
+             continue
+
+         inputs = tokenizer_state(
+             str(text), return_tensors='pt', truncation=True,
+             padding=True, max_length=512,
+         )
+         inputs = {k: v.to(dev) for k, v in inputs.items()}
+         with torch.no_grad():
+             outputs = model_state(**inputs)
+         probs = torch.softmax(outputs.logits, dim=1).squeeze()
+         pred = torch.argmax(probs).item()
+         conf = probs[pred].item() * 100
+         predictions.append(str(pred))
+         confidences.append(f"{conf:.1f}%")
+
+     df['predicted_class'] = predictions
+     df['confidence'] = confidences
+
+     out = tempfile.NamedTemporaryFile(suffix='_predictions.csv', delete=False)
+     df.to_csv(out.name, index=False)
+     return out.name
+
+
+ EXAMPLES_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "examples")
+
+
+ def load_example_binary():
+     """Load the binary classification example dataset."""
+     return (
+         os.path.join(EXAMPLES_DIR, "binary", "train.tsv"),
+         os.path.join(EXAMPLES_DIR, "binary", "dev.tsv"),
+         os.path.join(EXAMPLES_DIR, "binary", "test.tsv"),
+         "Binary",
+     )
+
+
+ def load_example_multiclass():
+     """Load the multiclass classification example dataset."""
+     return (
+         os.path.join(EXAMPLES_DIR, "multiclass", "train.tsv"),
+         os.path.join(EXAMPLES_DIR, "multiclass", "dev.tsv"),
+         os.path.join(EXAMPLES_DIR, "multiclass", "test.tsv"),
+         "Multiclass",
+     )
+
+
+ def run_comparison(
+     train_file, dev_file, test_file, task_type, selected_models,
+     epochs, batch_size, lr,
+     progress=gr.Progress(track_tqdm=True),
+ ):
+     """Train multiple models on the same data and compare performance + ROC curves."""
+     import plotly.graph_objects as go
+     from plotly.subplots import make_subplots
+
+     empty = ("", None, None, None, gr.Column(visible=False))
+     try:
+         if not selected_models or len(selected_models) < 2:
+             return ("Select at least 2 models to compare.",) + empty[1:]
+         if train_file is None or dev_file is None or test_file is None:
+             return ("Upload all 3 data files first.",) + empty[1:]
+
+         epochs = int(epochs)
+         batch_size = int(batch_size)
+
+         train_texts, train_labels, n_train = parse_data_file(train_file)
+         dev_texts, dev_labels, n_dev = parse_data_file(dev_file)
+         test_texts, test_labels, n_test = parse_data_file(test_file)
+         num_labels = max(n_train, n_dev, n_test)
+         if task_type == "Binary":
+             num_labels = 2
+
+         # Only keep these metrics for the table and bar chart
+         if task_type == "Binary":
+             keep_metrics = {'Accuracy', 'Precision', 'Recall', 'F1'}
+         else:
+             keep_metrics = {
+                 'Accuracy', 'F1 Macro', 'F1 Micro',
+                 'Precision Macro', 'Recall Macro',
+             }
+
+         results = []
+         roc_data = {}  # model_name -> (true_labels, probabilities)
+         log_lines = []
+
+         for i, model_display_name in enumerate(selected_models):
+             model_id = FINETUNE_MODELS[model_display_name]
+             short_name = model_display_name.split(" (")[0]
+             log_lines.append(f"[{i + 1}/{len(selected_models)}] Training {short_name}...")
+
+             try:
+                 tokenizer = AutoTokenizer.from_pretrained(model_id)
+                 model = AutoModelForSequenceClassification.from_pretrained(
+                     model_id, num_labels=num_labels,
+                 )
+                 train_ds = TextClassificationDataset(train_texts, train_labels, tokenizer, 512)
+                 dev_ds = TextClassificationDataset(dev_texts, dev_labels, tokenizer, 512)
+                 test_ds = TextClassificationDataset(test_texts, test_labels, tokenizer, 512)
+
+                 output_dir = tempfile.mkdtemp(prefix='conflibert_cmp_')
+                 best_metric = 'f1' if task_type == 'Binary' else 'f1_macro'
+
+                 training_args = TrainingArguments(
+                     output_dir=output_dir,
+                     num_train_epochs=epochs,
+                     per_device_train_batch_size=batch_size,
+                     per_device_eval_batch_size=batch_size * 2,
+                     learning_rate=lr,
+                     weight_decay=0.01,
+                     warmup_ratio=0.1,
+                     eval_strategy='epoch',
+                     save_strategy='epoch',
+                     load_best_model_at_end=True,
+                     metric_for_best_model=best_metric,
+                     greater_is_better=True,
+                     logging_steps=50,
+                     save_total_limit=1,
+                     report_to='none',
+                     seed=42,
+                 )
+
+                 trainer = Trainer(
+                     model=model,
+                     args=training_args,
+                     train_dataset=train_ds,
+                     eval_dataset=dev_ds,
+                     compute_metrics=make_compute_metrics(task_type),
+                 )
+
+                 train_result = trainer.train()
+
+                 # Get predictions for ROC curves
+                 pred_output = trainer.predict(test_ds)
+                 logits = pred_output.predictions
+                 true_labels = pred_output.label_ids
+                 probs = torch.softmax(torch.tensor(logits), dim=1).numpy()
+                 roc_data[short_name] = (true_labels, probs)
+
+                 # Collect classification metrics only
+                 test_results = trainer.evaluate(test_ds, metric_key_prefix='test')
+                 row = {'Model': short_name}
+                 for k, v in sorted(test_results.items()):
+                     if not isinstance(v, (int, float, np.floating, np.integer)):
+                         continue
+                     name = k.replace('test_', '').replace('_', ' ').title()
+                     if name in keep_metrics:
+                         row[name] = round(float(v), 4)
+                 results.append(row)
+
+                 runtime = train_result.metrics.get('train_runtime', 0)
+                 log_lines.append(f" Done in {runtime:.1f}s")
+
+                 del model, trainer, tokenizer, train_ds, dev_ds, test_ds
+                 gc.collect()
+                 if torch.cuda.is_available():
+                     torch.cuda.empty_cache()
+
+             except Exception as e:
+                 log_lines.append(f" Failed: {str(e)}")
+
+         log_lines.append(f"\nComparison complete. {len(results)} models evaluated.")
+         log_text = "\n".join(log_lines)
+
+         if not results:
+             return log_text, None, None, None, gr.Column(visible=False)
+
+         comparison_df = pd.DataFrame(results)
+
+         # --- Bar chart: classification metrics only ---
+         metric_cols = [c for c in comparison_df.columns if c in keep_metrics]
+         colors = ['#ff6b35', '#3b82f6', '#10b981', '#8b5cf6', '#f59e0b']
+         fig_bar = go.Figure()
+         for j, metric in enumerate(metric_cols):
+             fig_bar.add_trace(go.Bar(
+                 name=metric,
+                 x=comparison_df['Model'],
+                 y=comparison_df[metric],
+                 text=comparison_df[metric].apply(
+                     lambda x: f'{x:.3f}' if isinstance(x, float) else ''
+                 ),
+                 textposition='auto',
+                 marker_color=colors[j % len(colors)],
+             ))
+         fig_bar.update_layout(
+             barmode='group',
+             yaxis_title='Score', yaxis_range=[0, 1.05],
+             template='plotly_white',
+             legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1),
+             height=400, margin=dict(t=40, b=40),
+         )
+
+         # --- ROC curves ---
+         model_colors = ['#ff6b35', '#3b82f6', '#10b981', '#8b5cf6',
+                         '#f59e0b', '#ec4899', '#06b6d4']
+         fig_roc = go.Figure()
+         for j, (model_name, (labels, probs)) in enumerate(roc_data.items()):
+             color = model_colors[j % len(model_colors)]
+             if num_labels == 2:
+                 fpr, tpr, _ = roc_curve(labels, probs[:, 1])
+                 roc_auc_val = sk_auc(fpr, tpr)
+                 fig_roc.add_trace(go.Scatter(
+                     x=fpr, y=tpr, mode='lines',
+                     name=f'{model_name} (AUC = {roc_auc_val:.3f})',
+                     line=dict(color=color, width=2),
+                 ))
+             else:
+                 # Macro-average ROC for multiclass
+                 labels_bin = label_binarize(labels, classes=list(range(num_labels)))
+                 all_fpr = np.linspace(0, 1, 200)
+                 mean_tpr = np.zeros_like(all_fpr)
+                 for c in range(num_labels):
+                     fpr_c, tpr_c, _ = roc_curve(labels_bin[:, c], probs[:, c])
+                     mean_tpr += np.interp(all_fpr, fpr_c, tpr_c)
+                 mean_tpr /= num_labels
+                 roc_auc_val = sk_auc(all_fpr, mean_tpr)
+                 fig_roc.add_trace(go.Scatter(
+                     x=all_fpr, y=mean_tpr, mode='lines',
+                     name=f'{model_name} (macro AUC = {roc_auc_val:.3f})',
+                     line=dict(color=color, width=2),
+                 ))
+
+         # Diagonal reference line
+         fig_roc.add_trace(go.Scatter(
+             x=[0, 1], y=[0, 1], mode='lines',
+             line=dict(dash='dash', color='#ccc', width=1),
+             showlegend=False,
+         ))
+         fig_roc.update_layout(
+             xaxis_title='False Positive Rate',
+             yaxis_title='True Positive Rate',
+             template='plotly_white',
+             legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1),
+             height=400, margin=dict(t=40, b=40),
+         )
+
+         return log_text, comparison_df, fig_bar, fig_roc, gr.Column(visible=True)
+
+     except Exception as e:
+         return f"Comparison failed: {str(e)}", None, None, None, gr.Column(visible=False)
+
+
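For multiclass runs, `run_comparison` builds a macro-average ROC by interpolating each per-class curve onto a shared FPR grid and averaging the TPRs. A small numpy sketch of that construction with made-up curves (the AUC here is a hand-rolled trapezoid rule standing in for `sk_auc`):

```python
import numpy as np

# Hypothetical per-class ROC curves as (FPR, TPR) arrays -- not model output.
curves = [
    (np.array([0.0, 0.5, 1.0]), np.array([0.0, 0.8, 1.0])),
    (np.array([0.0, 0.25, 1.0]), np.array([0.0, 0.6, 1.0])),
]

# Interpolate every curve onto one shared FPR grid, then average the TPRs.
grid = np.linspace(0, 1, 5)
mean_tpr = np.zeros_like(grid)
for fpr, tpr in curves:
    mean_tpr += np.interp(grid, fpr, tpr)
mean_tpr /= len(curves)

# Trapezoid-rule AUC of the averaged curve
auc = float(((mean_tpr[1:] + mean_tpr[:-1]) / 2 * np.diff(grid)).sum())
```

Interpolating onto a common grid is what makes the per-class TPRs addable: raw `roc_curve` outputs have different threshold points for every class.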
+ # ============================================================================
+ # THEME & CSS
+ # ============================================================================
+
+ utd_orange = gr.themes.Color(
+     c50="#fff7f3", c100="#ffead9", c200="#ffd4b3", c300="#ffb380",
+     c400="#ff8c52", c500="#ff6b35", c600="#e8551f", c700="#c2410c",
+     c800="#9a3412", c900="#7c2d12", c950="#431407",
+ )
+
+ theme = gr.themes.Soft(
+     primary_hue=utd_orange,
+     secondary_hue="neutral",
+     font=gr.themes.GoogleFont("Inter"),
+ )
+
+ custom_css = """
+ /* Top accent bar */
+ .gradio-container::before {
+     content: '';
+     display: block;
+     height: 4px;
+     background: linear-gradient(90deg, #ff6b35, #ff9f40, #ff6b35);
+     position: fixed;
+     top: 0;
+     left: 0;
+     right: 0;
+     z-index: 1000;
+ }
+
+ /* Active tab styling */
+ .tab-nav button.selected {
+     border-bottom-color: #ff6b35 !important;
+     color: #ff6b35 !important;
+     font-weight: 600 !important;
+ }
+
+ /* Log output - monospace */
+ .log-output textarea {
+     font-family: 'JetBrains Mono', 'Fira Code', 'Consolas', monospace !important;
+     font-size: 0.8rem !important;
+     line-height: 1.5 !important;
+ }
+
+ /* Dark mode: info callout adjustment */
+ .dark .info-callout-inner {
+     background: rgba(255, 107, 53, 0.1) !important;
+     color: #ffead9 !important;
+ }
+
+ /* Clean container width */
+ .gradio-container {
+     max-width: 1200px !important;
+ }
+
+ /* Smooth transitions */
+ .gradio-container * {
+     transition: background-color 0.2s ease, border-color 0.2s ease !important;
+ }
+ """
+
+
+ # ============================================================================
+ # GRADIO UI
+ # ============================================================================
+
+ with gr.Blocks(theme=theme, css=custom_css, title="ConfliBERT") as demo:
+
+     # ---- HEADER ----
+     gr.Markdown(
+         "<div style='text-align: center; padding: 1.5rem 0 0.5rem;'>"
+         "<h1 style='font-size: 2.5rem; font-weight: 800; margin: 0;'>"
+         "<a href='https://eventdata.utdallas.edu/conflibert/' target='_blank' "
+         "style='color: #ff6b35; text-decoration: none;'>ConfliBERT</a></h1>"
+         "<p style='color: #888; font-size: 0.95rem; margin: 0.25rem 0 0;'>"
+         "A Pretrained Language Model for Conflict and Political Violence</p></div>"
+     )
+
+     with gr.Tabs():
+
+         # ================================================================
+         # HOME TAB
+         # ================================================================
+         with gr.Tab("Home"):
+             gr.Markdown(
+                 "## Welcome to ConfliBERT\n\n"
+                 "ConfliBERT is a pretrained language model built specifically for "
+                 "conflict and political violence text. This application lets you "
+                 "run inference with ConfliBERT's pretrained models and fine-tune "
+                 "your own classifiers on custom data. Use the tabs above to get started."
+             )
+
+             with gr.Row(equal_height=True):
+                 with gr.Column():
+                     gr.Markdown(
+                         "### Inference\n\n"
+                         "Run pretrained ConfliBERT models on your text. "
+                         "Each task has its own tab with single-text analysis "
+                         "and CSV batch processing.\n\n"
+                         "**Named Entity Recognition**\n"
+                         "Identify persons, organizations, locations, weapons, "
+                         "and other entities in text. Results are color-coded "
+                         "by entity type.\n\n"
+                         "**Binary Classification**\n"
+                         "Determine whether text is related to conflict, violence, "
+                         "or politics (positive) or not (negative). You can also "
+                         "load a custom fine-tuned model here.\n\n"
+                         "**Multilabel Classification**\n"
+                         "Score text against four event categories: Armed Assault, "
+                         "Bombing/Explosion, Kidnapping, and Other. Each category "
+                         "is scored independently.\n\n"
+                         "**Question Answering**\n"
+                         "Provide a context passage and ask a question. The model "
+                         "extracts the most relevant answer span from the text."
+                     )
+                 with gr.Column():
+                     gr.Markdown(
+                         "### Fine-tuning\n\n"
+                         "Train your own binary or multiclass text classifier "
+                         "on custom labeled data, all within the browser.\n\n"
+                         "**Workflow:**\n"
+                         "1. Upload your training, validation, and test data as "
+                         "TSV files (or load a built-in example dataset)\n"
+                         "2. Pick a base model: ConfliBERT, BERT, RoBERTa, "
+                         "ModernBERT, DeBERTa, or DistilBERT\n"
+                         "3. Configure training parameters (sensible defaults "
+                         "are provided)\n"
+                         "4. Train and watch progress in real time\n"
+                         "5. Review test-set metrics (accuracy, precision, "
+                         "recall, F1)\n"
+                         "6. Try your model on new text immediately\n"
+                         "7. Run batch predictions on a CSV\n"
+                         "8. Save the model and load it later in the "
+                         "Classification tab\n\n"
+                         "**Advanced features:**\n"
+                         "- Early stopping with configurable patience\n"
+                         "- Learning rate schedulers (linear, cosine, constant)\n"
+                         "- Mixed precision training (FP16 on CUDA GPUs)\n"
+                         "- Gradient accumulation for larger effective batch sizes\n"
+                         "- Weight decay regularization"
+                     )
+
+             gr.Markdown(
+                 f"---\n\n"
+                 f"**Your system:** {get_system_info()}"
+             )
+
+             gr.Markdown(
+                 "**Citation:** Brandt, P.T., Alsarra, S., D'Orazio, V., "
+                 "Heintze, D., Khan, L., Meher, S., Osorio, J. and Sianan, M., "
+                 "2025. Extractive versus Generative Language Models for Political "
+                 "Conflict Text Classification. *Political Analysis*, pp.1-29."
+             )
+
+         # ================================================================
+         # NER TAB
+         # ================================================================
+         with gr.Tab("Named Entity Recognition"):
+             gr.Markdown(info_callout(
+                 "Identify entities in text such as **persons**, **organizations**, "
+                 "**locations**, **weapons**, and more. Results are color-coded by type."
+             ))
+             with gr.Row(equal_height=True):
+                 with gr.Column():
+                     ner_input = gr.Textbox(
+                         lines=6,
+                         placeholder="Paste or type text to analyze for entities...",
+                         label="Input Text",
+                     )
+                     ner_btn = gr.Button("Analyze Entities", variant="primary")
+                 with gr.Column():
+                     ner_output = gr.HTML(label="Results")
+
+             with gr.Accordion("Batch Processing (CSV)", open=False):
+                 gr.Markdown(
+                     "Upload a CSV file with a `text` column to process "
+                     "multiple texts at once."
+                 )
+                 with gr.Row():
+                     ner_csv_in = gr.File(
+                         label="Upload CSV", file_types=[".csv"],
+                     )
+                     ner_csv_out = gr.File(label="Download Results")
+                 ner_csv_btn = gr.Button("Process CSV", variant="secondary")
+
+         # ================================================================
+         # BINARY CLASSIFICATION TAB
+         # ================================================================
+         with gr.Tab("Binary Classification"):
+             gr.Markdown(info_callout(
+                 "Classify text as **conflict-related** (positive) or "
+                 "**not conflict-related** (negative). Uses the pretrained ConfliBERT "
+                 "binary classifier by default, or load your own finetuned model below."
+             ))
+
+             custom_clf_model = gr.State(None)
+             custom_clf_tokenizer = gr.State(None)
+
+             with gr.Row(equal_height=True):
+                 with gr.Column():
+                     clf_input = gr.Textbox(
+                         lines=6,
+                         placeholder="Paste or type text to classify...",
+                         label="Input Text",
+                     )
+                     clf_btn = gr.Button("Classify", variant="primary")
+                 with gr.Column():
+                     clf_output = gr.HTML(label="Results")
+
+             with gr.Accordion("Batch Processing (CSV)", open=False):
+                 gr.Markdown("Upload a CSV file with a `text` column.")
+                 with gr.Row():
+                     clf_csv_in = gr.File(label="Upload CSV", file_types=[".csv"])
+                     clf_csv_out = gr.File(label="Download Results")
+                 clf_csv_btn = gr.Button("Process CSV", variant="secondary")
+
+             with gr.Accordion("Load Custom Model", open=False):
+                 gr.Markdown(
+                     "Load a finetuned classification model from a local directory "
+                     "to use instead of the default pretrained classifier."
+                 )
+                 clf_model_path = gr.Textbox(
+                     label="Model directory path",
+                     placeholder="e.g., ./finetuned_model",
+                 )
+                 with gr.Row():
+                     clf_load_btn = gr.Button("Load Model", variant="secondary")
+                     clf_reset_btn = gr.Button(
+                         "Reset to Pretrained", variant="secondary",
+                     )
+                 clf_status = gr.Markdown("")
+
+         # ================================================================
+         # MULTILABEL CLASSIFICATION TAB
+         # ================================================================
+         with gr.Tab("Multilabel Classification"):
+             gr.Markdown(info_callout(
+                 "Identify multiple event types in text. Each category is scored "
+                 "independently: **Armed Assault**, **Bombing/Explosion**, "
+                 "**Kidnapping**, **Other**. Categories above 50% confidence "
+                 "are highlighted."
+             ))
+             with gr.Row(equal_height=True):
+                 with gr.Column():
+                     multi_input = gr.Textbox(
+                         lines=6,
+                         placeholder="Paste or type text to classify...",
+                         label="Input Text",
+                     )
+                     multi_btn = gr.Button("Classify", variant="primary")
+                 with gr.Column():
+                     multi_output = gr.HTML(label="Results")
+
+             with gr.Accordion("Batch Processing (CSV)", open=False):
+                 gr.Markdown("Upload a CSV file with a `text` column.")
+                 with gr.Row():
+                     multi_csv_in = gr.File(label="Upload CSV", file_types=[".csv"])
+                     multi_csv_out = gr.File(label="Download Results")
+                 multi_csv_btn = gr.Button("Process CSV", variant="secondary")
+
+         # ================================================================
+         # QUESTION ANSWERING TAB
+         # ================================================================
+         with gr.Tab("Question Answering"):
+             gr.Markdown(info_callout(
+                 "Extract answers from a context passage. Provide a paragraph of "
+                 "text and ask a question about it. The model will highlight the "
+                 "most relevant span."
+             ))
+             with gr.Row(equal_height=True):
+                 with gr.Column():
+                     qa_context = gr.Textbox(
+                         lines=6,
+                         placeholder="Paste the context passage here...",
+                         label="Context",
+                     )
+                     qa_question = gr.Textbox(
+                         lines=2,
+                         placeholder="What would you like to know?",
+                         label="Question",
+                     )
+                     qa_btn = gr.Button("Get Answer", variant="primary")
+                 with gr.Column():
+                     qa_output = gr.HTML(label="Answer")
+
1327
+ # ================================================================
1328
+ # FINE-TUNE TAB
1329
+ # ================================================================
1330
+ with gr.Tab("Fine-tune"):
1331
+ gr.Markdown(info_callout(
1332
+ "Fine-tune a binary or multiclass classifier on your own data. "
1333
+ "Upload labeled TSV files, pick a base model, and train. "
1334
+ "Or compare multiple models head-to-head on the same dataset."
1335
+ ))
1336
+
+        # -- Data --
+        gr.Markdown("### Data")
+        gr.Markdown(
+            "TSV files, no header, format: `text[TAB]label` "
+            "(binary: 0/1, multiclass: 0, 1, 2, ...)"
+        )
+        with gr.Row():
+            ft_ex_binary_btn = gr.Button(
+                "Load Example: Binary", variant="secondary", size="sm",
+            )
+            ft_ex_multi_btn = gr.Button(
+                "Load Example: Multiclass (4 classes)", variant="secondary", size="sm",
+            )
+        with gr.Row():
+            ft_train_file = gr.File(
+                label="Train", file_types=[".tsv", ".csv", ".txt"],
+            )
+            ft_dev_file = gr.File(
+                label="Validation", file_types=[".tsv", ".csv", ".txt"],
+            )
+            ft_test_file = gr.File(
+                label="Test", file_types=[".tsv", ".csv", ".txt"],
+            )
+
+        # -- Configuration --
+        gr.Markdown("### Configuration")
+        with gr.Row():
+            ft_task = gr.Radio(
+                ["Binary", "Multiclass"],
+                label="Task Type", value="Binary",
+            )
+            ft_model = gr.Dropdown(
+                choices=list(FINETUNE_MODELS.keys()),
+                label="Base Model",
+                value=list(FINETUNE_MODELS.keys())[0],
+            )
+        with gr.Row():
+            ft_epochs = gr.Number(
+                label="Epochs", value=3, minimum=1, maximum=100, precision=0,
+            )
+            ft_batch = gr.Number(
+                label="Batch Size", value=8, minimum=1, maximum=128, precision=0,
+            )
+            ft_lr = gr.Number(
+                label="Learning Rate", value=2e-5, minimum=1e-7, maximum=1e-2,
+            )
+
+        with gr.Accordion("Advanced Settings", open=False):
+            with gr.Row():
+                ft_weight_decay = gr.Number(
+                    label="Weight Decay", value=0.01, minimum=0, maximum=1,
+                )
+                ft_warmup = gr.Number(
+                    label="Warmup Ratio", value=0.1, minimum=0, maximum=0.5,
+                )
+                ft_max_len = gr.Number(
+                    label="Max Sequence Length", value=512,
+                    minimum=32, maximum=8192, precision=0,
+                )
+            with gr.Row():
+                ft_grad_accum = gr.Number(
+                    label="Gradient Accumulation", value=1,
+                    minimum=1, maximum=64, precision=0,
+                )
+                ft_fp16 = gr.Checkbox(
+                    label="Mixed Precision (FP16)", value=False,
+                )
+                ft_patience = gr.Number(
+                    label="Early Stopping Patience", value=3,
+                    minimum=0, maximum=20, precision=0,
+                )
+                ft_scheduler = gr.Dropdown(
+                    ["linear", "cosine", "constant", "constant_with_warmup"],
+                    label="LR Scheduler", value="linear",
+                )
+
+        # -- Train --
+        ft_train_btn = gr.Button(
+            "Start Training", variant="primary", size="lg",
+        )
+
+        # State for the trained model
+        ft_model_state = gr.State(None)
+        ft_tokenizer_state = gr.State(None)
+        ft_num_labels_state = gr.State(None)
+
+        with gr.Accordion("Training Log", open=False) as ft_log_accordion:
+            ft_log = gr.Textbox(
+                lines=12, interactive=False, elem_classes="log-output",
+                show_label=False,
+            )
+
+        # -- Results + Try Model (hidden until training completes) --
+        with gr.Column(visible=False) as ft_results_col:
+            gr.Markdown("### Results")
+            with gr.Row(equal_height=True):
+                with gr.Column(scale=2):
+                    ft_metrics = gr.Dataframe(
+                        label="Test Set Metrics",
+                        headers=["Metric", "Score"],
+                        interactive=False,
+                    )
+                with gr.Column(scale=3):
+                    gr.Markdown("**Try your model**")
+                    ft_try_input = gr.Textbox(
+                        lines=2, label="Input Text",
+                        placeholder="Type text to classify...",
+                    )
+                    with gr.Row():
+                        ft_try_btn = gr.Button("Predict", variant="primary")
+                    ft_try_output = gr.HTML(label="Prediction")
+
+        # -- Save + Batch (hidden until training completes) --
+        with gr.Column(visible=False) as ft_actions_col:
+            with gr.Row(equal_height=True):
+                with gr.Column():
+                    gr.Markdown("**Save model**")
+                    ft_save_path = gr.Textbox(
+                        label="Save Directory", value="./finetuned_model",
+                    )
+                    ft_save_btn = gr.Button("Save", variant="secondary")
+                    ft_save_status = gr.Markdown("")
+                with gr.Column():
+                    gr.Markdown("**Batch predictions**")
+                    ft_batch_in = gr.File(
+                        label="Upload CSV (needs 'text' column)",
+                        file_types=[".csv"],
+                    )
+                    ft_batch_btn = gr.Button(
+                        "Run Predictions", variant="secondary",
+                    )
+                    ft_batch_out = gr.File(label="Download Results")
+
+        # -- Compare Models --
+        gr.Markdown("---")
+        with gr.Accordion("Compare Multiple Models", open=False):
+            gr.Markdown(
+                "Train different base models on the same dataset and compare "
+                "performance side by side. Uses the data and task type above."
+            )
+            cmp_models = gr.CheckboxGroup(
+                choices=list(FINETUNE_MODELS.keys()),
+                label="Select models to compare (pick 2 or more)",
+            )
+            with gr.Row():
+                cmp_epochs = gr.Number(label="Epochs", value=3, minimum=1, precision=0)
+                cmp_batch = gr.Number(label="Batch Size", value=8, minimum=1, precision=0)
+                cmp_lr = gr.Number(label="Learning Rate", value=2e-5, minimum=1e-7)
+            cmp_btn = gr.Button("Compare Models", variant="primary")
+            cmp_log = gr.Textbox(
+                label="Comparison Log", lines=8,
+                interactive=False, elem_classes="log-output",
+            )
+            with gr.Column(visible=False) as cmp_results_col:
+                cmp_table = gr.Dataframe(
+                    label="Comparison Results", interactive=False,
+                )
+                cmp_plot = gr.Plot(label="Metrics Comparison")
+                cmp_roc = gr.Plot(label="ROC Curves")
+
+    # ---- FOOTER ----
+    gr.Markdown(
+        "<div style='text-align: center; padding: 1rem 0; margin-top: 0.5rem; "
+        "border-top: 1px solid #e5e7eb;'>"
+        "<p style='color: #888; font-size: 0.85rem; margin: 0;'>"
+        "Developed by "
+        "<a href='http://shreyasmeher.com' target='_blank' "
+        "style='color: #ff6b35; text-decoration: none;'>Shreyas Meher</a>"
+        "</p>"
+        "<p style='color: #999; font-size: 0.75rem; margin: 0.5rem 0 0; "
+        "max-width: 700px; margin-left: auto; margin-right: auto; line-height: 1.4;'>"
+        "If you use ConfliBERT in your research, please cite:<br>"
+        "<em>Brandt, P.T., Alsarra, S., D'Orazio, V., Heintze, D., Khan, L., "
+        "Meher, S., Osorio, J. and Sianan, M., 2025. Extractive versus Generative "
+        "Language Models for Political Conflict Text Classification. "
+        "Political Analysis, pp.1&ndash;29.</em>"
+        "</p></div>"
+    )
+
+    # ====================================================================
+    # EVENT HANDLERS
+    # ====================================================================
+
+    # NER
+    ner_btn.click(
+        fn=named_entity_recognition, inputs=[ner_input], outputs=[ner_output],
+    )
+    ner_csv_btn.click(
+        fn=process_csv_ner, inputs=[ner_csv_in], outputs=[ner_csv_out],
+    )
+
+    # Binary Classification
+    clf_btn.click(
+        fn=text_classification,
+        inputs=[clf_input, custom_clf_model, custom_clf_tokenizer],
+        outputs=[clf_output],
+    )
+    clf_csv_btn.click(
+        fn=process_csv_binary,
+        inputs=[clf_csv_in, custom_clf_model, custom_clf_tokenizer],
+        outputs=[clf_csv_out],
+    )
+    clf_load_btn.click(
+        fn=load_custom_model,
+        inputs=[clf_model_path],
+        outputs=[custom_clf_model, custom_clf_tokenizer, clf_status],
+    )
+    clf_reset_btn.click(
+        fn=reset_custom_model,
+        outputs=[custom_clf_model, custom_clf_tokenizer, clf_status],
+    )
+
+    # Multilabel Classification
+    multi_btn.click(
+        fn=multilabel_classification, inputs=[multi_input], outputs=[multi_output],
+    )
+    multi_csv_btn.click(
+        fn=process_csv_multilabel, inputs=[multi_csv_in], outputs=[multi_csv_out],
+    )
+
+    # Question Answering
+    qa_btn.click(
+        fn=question_answering,
+        inputs=[qa_context, qa_question],
+        outputs=[qa_output],
+    )
+
+    # Fine-tuning: example dataset loaders
+    ft_ex_binary_btn.click(
+        fn=load_example_binary,
+        outputs=[ft_train_file, ft_dev_file, ft_test_file, ft_task],
+    )
+    ft_ex_multi_btn.click(
+        fn=load_example_multiclass,
+        outputs=[ft_train_file, ft_dev_file, ft_test_file, ft_task],
+    )
+
+    # Fine-tuning: training
+    ft_train_btn.click(
+        fn=run_finetuning,
+        inputs=[
+            ft_train_file, ft_dev_file, ft_test_file,
+            ft_task, ft_model,
+            ft_epochs, ft_batch, ft_lr,
+            ft_weight_decay, ft_warmup, ft_max_len,
+            ft_grad_accum, ft_fp16, ft_patience, ft_scheduler,
+        ],
+        outputs=[
+            ft_log, ft_metrics,
+            ft_model_state, ft_tokenizer_state, ft_num_labels_state,
+            ft_results_col, ft_actions_col,
+        ],
+        concurrency_limit=1,
+    )
+
+    # Try finetuned model
+    ft_try_btn.click(
+        fn=predict_finetuned,
+        inputs=[ft_try_input, ft_model_state, ft_tokenizer_state, ft_num_labels_state],
+        outputs=[ft_try_output],
+    )
+
+    # Save finetuned model
+    ft_save_btn.click(
+        fn=save_finetuned_model,
+        inputs=[ft_save_path, ft_model_state, ft_tokenizer_state],
+        outputs=[ft_save_status],
+    )
+
+    # Batch predictions with finetuned model
+    ft_batch_btn.click(
+        fn=batch_predict_finetuned,
+        inputs=[ft_batch_in, ft_model_state, ft_tokenizer_state, ft_num_labels_state],
+        outputs=[ft_batch_out],
+    )
+
+    # Model comparison
+    cmp_btn.click(
+        fn=run_comparison,
+        inputs=[
+            ft_train_file, ft_dev_file, ft_test_file,
+            ft_task, cmp_models, cmp_epochs, cmp_batch, cmp_lr,
+        ],
+        outputs=[cmp_log, cmp_table, cmp_plot, cmp_roc, cmp_results_col],
+        concurrency_limit=1,
+    )
+
+
+# ============================================================================
+# LAUNCH
+# ============================================================================
+
+demo.launch(share=True)
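The Fine-tune tab's settings (epochs, batch size, gradient accumulation, warmup ratio) combine into the usual training-loop quantities. A minimal sketch of that arithmetic, using the UI defaults above and the standard `transformers`/`Trainer` conventions (steps per epoch rounded up, warmup as a fraction of total steps) as an assumption rather than anything read from this app's training code:

```python
import math

# Defaults mirrored from the Fine-tune tab above; num_examples matches
# examples/binary/train.tsv. The formulas are the common Trainer convention,
# assumed here for illustration.
num_examples = 81
epochs = 3          # ft_epochs
batch_size = 8      # ft_batch
grad_accum = 1      # ft_grad_accum
warmup_ratio = 0.1  # ft_warmup

effective_batch = batch_size * grad_accum
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = int(total_steps * warmup_ratio)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
# -> 8 11 33 3
```

Raising Gradient Accumulation trades memory for a larger effective batch at the cost of fewer optimizer steps, which is why the warmup count shrinks with it.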
examples/binary/dev.tsv ADDED
@@ -0,0 +1,20 @@
+ The peacekeeping mission reported increased hostilities along the ceasefire line 1
+ The warlord's militia seized control of the strategically important bridge 1
+ The naval forces intercepted a shipment of weapons destined for the rebels 1
+ The terrorist organization released a video threatening further attacks 1
+ Armed groups continue to recruit child soldiers in violation of international law 1
+ Ethnic cleansing campaigns have forced entire communities to flee their lands 1
+ The arms embargo was violated as new weapons flowed into the conflict zone 1
+ Soldiers exchanged fire with suspected militants near the border crossing 1
+ Multiple explosions were reported near the presidential palace overnight 1
+ The rebel commander announced a new offensive targeting government supply routes 1
+ The fashion designer presented a stunning collection at the annual style week 0
+ The telecommunications company expanded its fiber network to rural areas 0
+ The basketball league announced changes to the playoff format next season 0
+ A new species of butterfly was documented during a biodiversity survey 0
+ The pottery exhibition drew visitors from across the region 0
+ The national library digitized thousands of historical manuscripts 0
+ The airline reported record passenger numbers during the holiday season 0
+ The botanical garden opened a new section dedicated to tropical plants 0
+ Researchers developed a more efficient method for recycling plastic waste 0
+ The chess tournament attracted grandmasters from over thirty countries 0
examples/binary/test.tsv ADDED
@@ -0,0 +1,20 @@
+ The siege of the coastal city entered its fourth month with supplies running low 1
+ Rebel fighters launched rockets at the airport disrupting all flights 1
+ The militia carried out reprisal attacks against civilians in the border towns 1
+ Security forces clashed with armed protesters near the government district 1
+ A landmine explosion killed three children walking to school in the rural area 1
+ The insurgent group released hostages after weeks of tense negotiations 1
+ Mortar rounds struck a hospital compound wounding patients and medical staff 1
+ The warring parties rejected the latest ceasefire proposal from international mediators 1
+ Government troops advanced into rebel-held territory following an intense bombardment 1
+ An assassination attempt on the defense minister was foiled by security services 1
+ The renewable energy conference attracted investors from around the world 0
+ A new underwater cable was laid connecting the island to the mainland grid 0
+ The jazz festival featured performances by legendary musicians and new artists 0
+ The national team qualified for the world championships for the first time 0
+ The organic food market experienced double-digit growth for the third straight year 0
+ The space agency successfully launched a satellite to monitor ocean temperatures 0
+ The vintage car rally drew enthusiasts from neighboring countries 0
+ The children's hospital received a generous donation for its expansion project 0
+ The urban farming initiative transformed abandoned lots into productive gardens 0
+ The documentary about traditional boat-building won the top prize at the festival 0
examples/binary/train.tsv ADDED
@@ -0,0 +1,81 @@
+ Government forces launched an offensive against rebel positions in the northern province early this morning 1
+ A car bomb exploded near a military checkpoint killing at least twelve soldiers 1
+ Insurgents attacked a police station in the capital overnight leaving several officers wounded 1
+ The militant group claimed responsibility for the ambush on a military convoy 1
+ Heavy fighting broke out between rival armed factions in the disputed border region 1
+ Security forces conducted raids targeting suspected members of the armed opposition 1
+ A suicide bomber detonated explosives at a crowded marketplace injuring dozens of civilians 1
+ The rebel forces captured a strategic town after weeks of intense battles 1
+ Artillery shells struck residential areas as the conflict between the two sides intensified 1
+ An airstrike destroyed a weapons depot used by the insurgent group 1
+ The government declared a state of emergency following widespread political violence 1
+ Armed men attacked a village killing several residents and burning homes 1
+ Protesters clashed violently with police during demonstrations against the military regime 1
+ A roadside bomb targeted a military patrol wounding three soldiers 1
+ The armed group kidnapped aid workers operating in the conflict zone 1
+ Sniper fire killed two civilians in the besieged neighborhood 1
+ Military helicopters were deployed to support ground troops fighting in the eastern region 1
+ An explosion at a government building was attributed to opposition fighters 1
+ Cross-border shelling between the two nations continued for the third consecutive day 1
+ Armed bandits attacked a refugee camp displacing thousands of people 1
+ The guerrilla fighters ambushed a supply convoy on the main highway 1
+ The opposing forces exchanged heavy gunfire throughout the night 1
+ A mortar attack on the military base resulted in significant casualties 1
+ Paramilitary groups carried out targeted assassinations of political opponents 1
+ Government aircraft bombed suspected rebel strongholds in the mountainous region 1
+ Two soldiers were killed when their vehicle struck a landmine on a rural road 1
+ The separatist movement launched coordinated attacks on government installations 1
+ Ethnic tensions erupted into open violence as rival communities clashed in the market 1
+ Armed opposition forces shelled the outskirts of the capital city 1
+ A grenade attack on a busy intersection killed four people and wounded many more 1
+ The military junta deployed tanks to suppress the growing resistance movement 1
+ Fighting between government troops and rebels displaced thousands of families 1
+ Security operations intensified after a series of bombings in the commercial district 1
+ A militia group took control of a key oil facility in the contested region 1
+ An improvised explosive device was found near the parliament building 1
+ Coalition forces conducted a night raid capturing several high-value targets 1
+ The ongoing civil war has resulted in thousands of casualties and widespread destruction 1
+ A drone strike targeted a meeting of senior militant commanders 1
+ The opposition forces breached the defensive perimeter around the government compound 1
+ Gunmen opened fire on a convoy of government officials killing two bodyguards 1
+ The national football team secured a convincing victory in the qualifying match 0
+ Temperatures are expected to reach record highs this weekend according to forecasters 0
+ The technology company unveiled its latest smartphone with improved camera capabilities 0
+ Stock markets rallied on news of stronger than expected economic growth 0
+ The film festival announced its lineup featuring works from emerging directors 0
+ Scientists discovered a new species of deep-sea fish in the Pacific Ocean 0
+ The university announced a new scholarship program for students in engineering 0
+ Local farmers reported an excellent harvest this season due to favorable weather 0
+ The city council approved plans for a new public park in the downtown area 0
+ A major software update was released improving performance and adding new features 0
+ The marathon attracted over twenty thousand runners from across the country 0
+ Researchers published findings on a promising treatment for a rare disorder 0
+ The airline announced new direct flights connecting the capital with European cities 0
+ A popular author released the highly anticipated sequel to her bestselling novel 0
+ The automotive company revealed plans to launch three new electric vehicle models 0
+ Annual tourism numbers reached an all-time high at the coastal resorts 0
+ The construction of the new high-speed rail line is ahead of schedule 0
+ Astronomers observed a rare celestial event visible from the southern hemisphere 0
+ The bakery chain announced plans to expand into twelve new locations 0
+ The tech startup raised significant funding in its latest investment round 0
+ The swimming team broke the national record at the regional championships 0
+ A new study found that regular exercise significantly reduces heart disease risk 0
+ The city hosted a successful international food and wine festival 0
+ Archaeologists uncovered ancient pottery at a dig site near the monument 0
+ The pharmaceutical company received approval for a new vaccine formulation 0
+ A local nonprofit organized a community cleanup event at the riverside park 0
+ The solar energy project is expected to power thousands of homes by year end 0
+ The gaming company released a new title that quickly became a bestseller 0
+ Public transit ridership increased following improvements to the subway system 0
+ The annual science fair showcased innovative projects by high school students 0
+ The dairy industry adopted new standards for sustainable milk production 0
+ The orchestra performed a sold-out concert of works by contemporary composers 0
+ The hospital inaugurated a state-of-the-art wing dedicated to pediatric care 0
+ The cycling tour attracted international competitors to the coastal route 0
+ The agricultural ministry launched a program to support organic farming 0
+ A popular streaming service announced an original series based on the classic novel 0
+ The winter ski season opened early due to heavy snowfall in the mountains 0
+ The electric vehicle charging network expanded to cover all major highways 0
+ The oceanographic institute published research on coral reef restoration 0
+ The cookbook featuring traditional regional recipes became an unexpected bestseller 0
+ The museum opened a new exhibition showcasing contemporary sculpture and painting 0
examples/multiclass/dev.tsv ADDED
@@ -0,0 +1,20 @@
+ The peace envoy presented a revised framework for territorial compromise 0
+ Regional leaders convened an emergency session to address the border crisis 0
+ Negotiations on the arms limitation treaty entered their final round 0
+ The joint commission agreed to establish demilitarized buffer zones 0
+ A new diplomatic initiative aimed at ending the decades-long standoff was announced 0
+ Militia forces overran a government checkpoint killing the defenders 1
+ An ambush on the supply column resulted in the loss of critical equipment 1
+ Fighter jets conducted precision strikes on command and control centers 1
+ The battle for the provincial capital intensified with street-to-street fighting 1
+ Landmines planted along the withdrawal route caused additional military casualties 1
+ Pro-democracy activists organized a candlelight vigil outside the detention center 2
+ Transport workers walked off the job paralyzing the rail network 2
+ Demonstrators erected barricades across main roads in defiance of the curfew 2
+ The environmental movement staged protests at industrial sites across the country 2
+ Police fired rubber bullets at stone-throwing youths during the confrontation 2
+ Emergency medical supplies were rushed to the hospital overwhelmed with casualties 3
+ The displaced population established makeshift camps along the roadside 3
+ Aid agencies warned of an impending water crisis in the drought-stricken region 3
+ Rescue teams searched through rubble for survivors after the earthquake 3
+ The nutrition program reached over ten thousand malnourished children this month 3
examples/multiclass/test.tsv ADDED
@@ -0,0 +1,20 @@
+ The multilateral agreement established protocols for maritime dispute resolution 0
+ Both delegations expressed optimism following the fourth round of peace talks 0
+ The international community welcomed the signing of the normalization agreement 0
+ Economic sanctions were partially lifted following compliance with treaty obligations 0
+ The mediation team proposed a phased withdrawal plan accepted by both parties 0
+ The armored column advanced through the valley under heavy enemy fire 1
+ Rockets struck the airfield destroying several military aircraft on the ground 1
+ Special operations forces carried out a raid deep behind enemy lines 1
+ The naval blockade prevented resupply of the besieged coastal garrison 1
+ Anti-aircraft fire downed a reconnaissance drone over the contested territory 1
+ Massive crowds filled the boulevard demanding free and fair elections 2
+ The dock workers union expanded its strike to include all major ports 2
+ Indigenous communities organized roadblocks to protest land seizure by corporations 2
+ Thousands of women marched demanding an end to gender-based violence 2
+ Student groups staged walkouts across dozens of universities nationwide 2
+ The refugee crisis deepened as thousands more fled across the border overnight 3
+ Field hospitals operated at capacity treating both military and civilian wounded 3
+ The humanitarian airlift delivered critical medical supplies to the isolated town 3
+ Sanitation conditions in the overcrowded camp raised fears of disease outbreaks 3
+ International rescue teams deployed to assist with flood evacuation efforts 3
examples/multiclass/train.tsv ADDED
@@ -0,0 +1,80 @@
+ The two nations signed a bilateral trade agreement during the summit meeting 0
+ Foreign ministers met to discuss the terms of the proposed peace deal 0
+ The United Nations Security Council passed a resolution imposing new sanctions 0
+ Diplomatic envoys were dispatched to mediate between the conflicting parties 0
+ The ambassador presented new proposals for resolving the territorial dispute 0
+ A ceasefire agreement was brokered by regional mediators after weeks of talks 0
+ The peace conference concluded with a joint declaration of mutual cooperation 0
+ International observers praised the diplomatic progress made at the negotiations 0
+ The two governments established formal diplomatic relations for the first time 0
+ Trade negotiations between the economic bloc and the developing nation resumed 0
+ The foreign affairs committee approved the new bilateral defense cooperation pact 0
+ A high-level delegation arrived in the capital for talks on nuclear disarmament 0
+ The treaty on maritime boundaries was ratified by both nations parliaments 0
+ International mediators proposed a roadmap for political transition and elections 0
+ Leaders of the rival factions agreed to power-sharing arrangements at the talks 0
+ The alliance pledged continued diplomatic support for the peace process 0
+ An arms control agreement was reached limiting missile deployments in the region 0
+ The special envoy shuttled between capitals seeking a breakthrough in negotiations 0
+ Both sides agreed to exchange prisoners as a confidence-building measure 0
+ The summit produced a framework for addressing cross-border resource disputes 0
+ Government forces launched a major offensive against rebel positions in the east 1
+ A car bomb exploded near the military headquarters killing at least eight people 1
+ Insurgents ambushed a military convoy destroying several armored vehicles 1
+ The air force conducted airstrikes on militant camps in the mountain region 1
+ Heavy fighting erupted between government troops and separatist fighters 1
+ A roadside bomb killed five soldiers on patrol near the contested border area 1
+ The armed group captured a military outpost after a fierce overnight battle 1
+ Artillery barrages devastated residential neighborhoods in the besieged city 1
+ Coalition warplanes struck weapons storage facilities operated by the militia 1
+ Two helicopter gunships were shot down during combat operations in the valley 1
+ The rebel offensive resulted in the capture of three strategic hilltop positions 1
+ A suicide attack on the army base left dozens of soldiers dead or wounded 1
+ Snipers targeted civilians attempting to flee the fighting in the urban center 1
+ Naval forces engaged enemy vessels in a brief but intense exchange of fire 1
+ The battalion suffered heavy losses during the assault on the fortified position 1
+ Ground forces advanced under cover of sustained aerial bombardment 1
+ Mortar fire struck the refugee camp adjacent to the front lines 1
+ Tank divisions moved into position along the disputed ceasefire line 1
+ The garrison surrendered after running out of ammunition during the prolonged siege 1
+ Drone surveillance identified enemy troop movements ahead of the counterattack 1
+ Thousands of demonstrators gathered in the central square demanding government reform 2
+ Riot police used tear gas to disperse crowds outside the parliament building 2
+ Workers across the industrial sector launched a nationwide general strike 2
+ Student protesters occupied the university administration building for three days 2
+ The opposition organized mass rallies in cities across the country 2
+ Police arrested dozens of activists during an unauthorized march through downtown 2
+ Labor unions called for indefinite strikes to protest proposed austerity measures 2
+ Demonstrators blocked major highways disrupting transportation and commerce 2
+ Anti-government protests entered their second week with no sign of subsiding 2
+ The youth movement organized sit-ins at government offices across the region 2
+ Protesters set fire to government vehicles during clashes in the capital 2
+ Civil society groups staged peaceful vigils demanding the release of political prisoners 2
+ Tens of thousands marched against corruption in the largest demonstration in years 2
+ The teachers union voted to strike over pay cuts and deteriorating school conditions 2
+ Farmers drove tractors into the city center to protest agricultural subsidy reductions 2
+ Activists chained themselves to the gates of the energy ministry 2
+ The pro-democracy movement announced plans for a week of sustained civil disobedience 2
+ Shopkeepers shuttered their businesses in solidarity with the striking workers 2
+ University students clashed with security forces during a campus demonstration 2
+ Residents organized neighborhood protests against the proposed construction project 2
+ International aid organizations delivered food supplies to the displaced population 3
+ The refugee camp expanded rapidly as thousands fled the advancing front lines 3
+ Medical teams set up field hospitals to treat civilians injured in the crossfire 3
+ The humanitarian corridor allowed evacuation of wounded from the conflict zone 3
+ Food shortages reached critical levels as supply routes remained blocked 3
+ Emergency shelters were established for families displaced by the flooding 3
+ Aid workers distributed clean water and sanitation supplies to the affected areas 3
+ The World Health Organization launched a vaccination campaign in the crisis region 3
+ Thousands of refugees crossed the border seeking safety in neighboring countries 3
+ The famine early warning system indicated severe food insecurity in the southern region 3
+ Humanitarian agencies appealed for additional funding to support relief operations 3
+ Displaced families struggled to find shelter as winter temperatures dropped sharply 3
+ The Red Cross established a blood donation drive to support overwhelmed hospitals 3
+ Emergency food rations were airlifted to communities cut off by the fighting 3
+ Child protection agencies reported a surge in unaccompanied minors at border crossings 3
+ The cholera outbreak in the camp prompted an emergency public health response 3
+ International donors pledged millions in reconstruction aid at the conference 3
+ Volunteer groups organized clothing and supply drives for the disaster survivors 3
+ The malnutrition rate among children under five reached alarming levels 3
+ Mobile clinics provided medical care to remote communities affected by the crisis 3
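The example files above follow the format the Fine-tune tab asks for: TSV, no header, one example per line as `text[TAB]label`. A small stdlib-only sketch of writing and reading such a file (the sample rows are shortened paraphrases of the data above, for illustration only):

```python
import csv
import os
import tempfile

# Two rows in the Fine-tune tab's expected shape: text<TAB>label, no header.
rows = [
    ("Government forces launched an offensive against rebel positions", 1),
    ("The museum opened a new exhibition of contemporary sculpture", 0),
]

# Write the TSV the way the app expects to receive it.
path = os.path.join(tempfile.mkdtemp(), "train.tsv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter="\t").writerows(rows)

# Read it back the way a loader might, coercing labels to ints.
with open(path, encoding="utf-8") as f:
    data = [(text, int(label)) for text, label in csv.reader(f, delimiter="\t")]
print(data[0][1], data[1][1])  # -> 1 0
```

For multiclass data the only change is the label range (0, 1, 2, ...), matching the four-class example files.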
requirements.txt CHANGED
@@ -1,5 +1,9 @@
- torch
- tensorflow
- transformers
- gradio
- tf-keras
+ torch
+ tensorflow
+ transformers
+ gradio
+ tf-keras
+ accelerate
+ scikit-learn
+ pandas
+ plotly
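This commit adds `accelerate`, `scikit-learn`, `pandas`, and `plotly` to support the fine-tuning and comparison features. A quick stdlib-only sketch for checking that the new dependencies are importable in the current environment (note that the scikit-learn distribution imports as `sklearn`; `tf-keras` is omitted here since its import name differs from the package name):

```python
import importlib.util

# Import-time module names for the packages in requirements.txt;
# the mapping from distribution name to import name is an assumption
# noted above, not something the app checks itself.
required = ["torch", "tensorflow", "transformers", "gradio",
            "accelerate", "sklearn", "pandas", "plotly"]

missing = [mod for mod in required if importlib.util.find_spec(mod) is None]
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All fine-tuning dependencies are available.")
```

Running this before launching the app surfaces a missing dependency as a clear message instead of an import traceback at startup.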
screenshots/classification.png ADDED

Git LFS Details

  • SHA256: 34a798a8b7e06780d86b9c5db54b8e52d36ad4ac890f6463b57268e9cf18f17c
  • Pointer size: 131 Bytes
  • Size of remote file: 329 kB
screenshots/finetune.png ADDED

Git LFS Details

  • SHA256: 8c23081d43a14e90a77e2a201ac87b4b38216dda1dd54c6ba74b4540862e8ee7
  • Pointer size: 131 Bytes
  • Size of remote file: 488 kB
screenshots/home.png ADDED

Git LFS Details

  • SHA256: 2f2441498b30bc603f53b132539740780e6e830dd578a0b1d9e9e12693d07022
  • Pointer size: 131 Bytes
  • Size of remote file: 806 kB
screenshots/multilabel.png ADDED

Git LFS Details

  • SHA256: a665f7339d5f87a82d1211c44eabc8d291a0f6d3b4c6a8c105cc6d4af58e8ce0
  • Pointer size: 131 Bytes
  • Size of remote file: 321 kB
screenshots/ner.png ADDED

Git LFS Details

  • SHA256: c7d632e3c808d7615cf80205399d751d3664d836ab32ab4f5589d925f2aaa370
  • Pointer size: 131 Bytes
  • Size of remote file: 305 kB
screenshots/qa.png ADDED

Git LFS Details

  • SHA256: ba5742b7fccd00dae57e487181b005abf6a18c3d31a44c935e57b224471786cd
  • Pointer size: 131 Bytes
  • Size of remote file: 310 kB