Spaces: MartinRodrigo/transformer-sentiment-analysis

Martin Rodrigo Morales committed on Oct 24, 2025

Commit: b3fbeb1
Parent: 6609649

Update: Simplified Gradio app with fine-tuned model

Files changed (3)

README.md +29 -38
app.py +285 -0
requirements.txt +4 -12

README.md CHANGED

@@ -4,61 +4,52 @@ emoji: 🤖
 colorFrom: blue
 colorTo: purple
 sdk: gradio
-sdk_version: "4.0"
-app_file: gradio_app.py
 pinned: false
-license: mit
 tags:
   - sentiment-analysis
   - transformers
-  - pytorch
-  - nlp
   - distilbert
-  - machine-learning
 models:
-  - distilbert-base-uncased-finetuned-sst-2-english
-datasets:
-  - imdb
-  - sst2
 ---
 # 🤖 Transformer Sentiment Analysis
-Advanced AI-powered sentiment analysis using state-of-the-art transformer models.
-## ✨ Features
-- **Real-time Analysis**: Instant sentiment classification with confidence scores
-- **Batch Processing**: Analyze multiple texts simultaneously
-- **Interactive Visualizations**: Probability distributions and analytics
-- **Professional Interface**: Modern, responsive UI design
-- **Production-Ready**: Optimized for performance and scalability
-## 🧠 Model Details
-- **Architecture**: DistilBERT (66M parameters)
-- **Performance**: 74% accuracy on IMDB dataset
-- **Speed**: ~100ms inference time
-- **Training**: Fine-tuned on Stanford Sentiment Treebank
-## 🚀 Tech Stack
-- **Framework**: PyTorch + Hugging Face Transformers
-- **Interface**: Gradio with custom CSS
-- **Backend**: FastAPI with async support
-- **Deployment**: Docker + Cloud platforms
-## 🎯 Use Cases
-- Social media monitoring
-- Customer feedback analysis
-- Market research insights
-- Product review classification
-## 🔗 Links
-- **GitHub Repository**: [Complete source code and documentation](https://github.com/mrdesautu/ransformer-sentiment-analysis)
-- **Live Demo**: Try the interactive demo above
-- **Documentation**: Comprehensive guides and API docs
-Built with modern ML engineering practices including comprehensive testing, CI/CD, and scalable deployment configurations.

 colorFrom: blue
 colorTo: purple
 sdk: gradio
+sdk_version: 4.44.0
+app_file: app.py
 pinned: false
+license: apache-2.0
 tags:
   - sentiment-analysis
   - transformers
   - distilbert
+  - mlflow
+  - pytorch
 models:
+  - MartinRodrigo/distilbert-sentiment-imdb
 ---
 # 🤖 Transformer Sentiment Analysis
+Advanced sentiment analysis using DistilBERT fine-tuned on IMDB dataset with MLflow experiment tracking.
+## 🎯 Model Performance
+- **Accuracy:** 80% on IMDB test set
+- **F1 Score:** 0.7981
+- **Model:** DistilBERT (66M parameters)
+- **Speed:** ~100ms per prediction
+## 🚀 Features
+- Real-time sentiment analysis
+- Batch text processing
+- Confidence scores and probabilities
+- Interactive visualizations
+## 🔗 Links
+- **Model Repository:** [MartinRodrigo/distilbert-sentiment-imdb](https://huggingface.co/MartinRodrigo/distilbert-sentiment-imdb)
+- **GitHub:** [transformer-sentiment-analysis](https://github.com/mrdesautu/ransformer-sentiment-analysis)
+## 💡 Usage
+```python
+from transformers import pipeline
+classifier = pipeline("sentiment-analysis",
+                     model="MartinRodrigo/distilbert-sentiment-imdb")
+result = classifier("I love this movie!")
+print(result)
+```
+Built with Transformers, MLflow, and Gradio 🚀
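
Note on the README usage snippet: the `sentiment-analysis` pipeline returns a list of dicts with `label` and `score` keys. The commit does not show whether the fine-tuned model's config defines named labels, so the output may carry generic names such as `LABEL_1`. The sketch below is one possible way to normalize that output. The helper `normalize_prediction` and the `GENERIC_LABELS` table are hypothetical, and the mapping assumes index 1 means positive, as `app.py` in this commit does.

```python
from typing import Dict, List

# Assumption: index 1 is the positive class, mirroring the mapping
# hard-coded in app.py. Not confirmed by the model's config.
GENERIC_LABELS = {"LABEL_0": "NEGATIVE", "LABEL_1": "POSITIVE"}


def normalize_prediction(result: List[Dict]) -> Dict:
    """Turn one pipeline result into a sentiment name and a rounded score."""
    top = result[0]
    # Fall back to the upper-cased label when it is not a generic name.
    label = GENERIC_LABELS.get(top["label"], top["label"].upper())
    return {"sentiment": label, "confidence": round(float(top["score"]), 4)}


# Hand-written sample in the pipeline's output shape (not real model output).
sample = [{"label": "LABEL_1", "score": 0.9876}]
print(normalize_prediction(sample))
```

With the real pipeline, the list returned by `classifier("I love this movie!")` could be passed to the helper in place of `sample`.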

app.py ADDED

@@ -0,0 +1,285 @@

+#!/usr/bin/env python3
+"""
+Gradio app for HuggingFace Spaces
+Sentiment analysis with MLflow metrics visualization
+"""
+import gradio as gr
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import numpy as np
+import plotly.express as px
+import plotly.graph_objects as go
+import pandas as pd
+from typing import Dict, List, Tuple
+import logging
+# Configure logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+class SentimentAnalyzer:
+    """Sentiment analyzer for production"""
+    def __init__(self):
+        # Use the deployed model from HuggingFace Hub
+        self.model_name = "MartinRodrigo/distilbert-sentiment-imdb"
+        self.tokenizer = None
+        self.model = None
+        self.load_model()
+    def load_model(self):
+        """Load the fine-tuned model"""
+        try:
+            logger.info(f"Loading model: {self.model_name}")
+            self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
+            self.model = AutoModelForSequenceClassification.from_pretrained(self.model_name)
+            logger.info("Model loaded successfully!")
+        except Exception as e:
+            logger.error(f"Error loading model: {e}")
+            # Fallback to base model
+            logger.info("Falling back to base model...")
+            self.model_name = "distilbert-base-uncased-finetuned-sst-2-english"
+            self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
+            self.model = AutoModelForSequenceClassification.from_pretrained(self.model_name)
+    def analyze_single(self, text: str) -> Dict:
+        """Analyze sentiment of a single text"""
+        if not text.strip():
+            return {
+                "sentiment": "Please enter some text",
+                "confidence": 0.0,
+                "probabilities": None
+            }
+        try:
+            inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+            with torch.no_grad():
+                outputs = self.model(**inputs)
+                predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
+            probs = predictions[0].numpy()
+            predicted_class = np.argmax(probs)
+            confidence = float(probs[predicted_class])
+            sentiment = "POSITIVE" if predicted_class == 1 else "NEGATIVE"
+            return {
+                "sentiment": sentiment,
+                "confidence": confidence,
+                "probabilities": {
+                    "Negative": float(probs[0]),
+                    "Positive": float(probs[1])
+                }
+            }
+        except Exception as e:
+            logger.error(f"Error in analysis: {e}")
+            return {
+                "sentiment": f"Error: {str(e)}",
+                "confidence": 0.0,
+                "probabilities": None
+            }
+    def analyze_batch(self, texts: List[str]) -> List[Dict]:
+        """Analyze multiple texts"""
+        results = []
+        for text in texts:
+            if text.strip():
+                results.append(self.analyze_single(text))
+        return results
+# Initialize analyzer
+analyzer = SentimentAnalyzer()
+def analyze_sentiment(text: str) -> Tuple[str, float, dict]:
+    """Main analysis function for Gradio"""
+    result = analyzer.analyze_single(text)
+    if result["probabilities"]:
+        df = pd.DataFrame([
+            {"Sentiment": "Negative", "Probability": result["probabilities"]["Negative"]},
+            {"Sentiment": "Positive", "Probability": result["probabilities"]["Positive"]}
+        ])
+        fig = px.bar(
+            df,
+            x="Sentiment",
+            y="Probability",
+            color="Sentiment",
+            color_discrete_map={"Negative": "#ff4444", "Positive": "#44ff44"},
+            title="Sentiment Probability Distribution"
+        )
+        fig.update_layout(showlegend=False, height=300)
+        return (
+            f"**{result['sentiment']}** (Confidence: {result['confidence']:.1%})",
+            result['confidence'],
+            fig
+        )
+    return result['sentiment'], result['confidence'], None
+def analyze_batch_texts(text_input: str) -> Tuple[str, dict]:
+    """Analyze multiple texts separated by newlines"""
+    if not text_input.strip():
+        return "Please enter some texts (one per line)", None
+    texts = [line.strip() for line in text_input.split('\n') if line.strip()]
+    if not texts:
+        return "No valid texts found", None
+    results = analyzer.analyze_batch(texts)
+    summary_lines = []
+    plot_data = []
+    for i, (text, result) in enumerate(zip(texts, results)):
+        sentiment = result['sentiment']
+        confidence = result['confidence']
+        summary_lines.append(f"{i+1}. **{sentiment}** ({confidence:.1%}) - {text[:50]}{'...' if len(text) > 50 else ''}")
+        plot_data.append({
+            "Text": f"Text {i+1}",
+            "Sentiment": sentiment,
+            "Confidence": confidence
+        })
+    summary = "\n".join(summary_lines)
+    if plot_data:
+        df = pd.DataFrame(plot_data)
+        fig = px.bar(
+            df,
+            x="Text",
+            y="Confidence",
+            color="Sentiment",
+            color_discrete_map={"NEGATIVE": "#ff4444", "POSITIVE": "#44ff44"},
+            title="Batch Analysis Results"
+        )
+        fig.update_layout(height=400)
+        return summary, fig
+    return summary, None
+# Demo examples
+EXAMPLES = [
+    "🎬 This movie absolutely blew my mind! Best film I've seen this year!",
+    "😞 Worst customer service ever. Total waste of money.",
+    "🚀 Revolutionary AI technology! Incredible understanding of language.",
+    "❌ I regret this purchase deeply. Poor quality materials.",
+    "✈️ Amazing travel experience! The hotel exceeded expectations!",
+    "🎵 Concert was phenomenal! Everything was absolutely perfect!"
+]
+BATCH_EXAMPLE = """🛍️ This online store has amazing customer service!
+😡 Terrible experience with their support team.
+⭐ Outstanding quality! Exceeded all my expectations.
+💸 Disappointed with this expensive purchase."""
+# Create Gradio interface
+with gr.Blocks(
+    title="🤖 Transformer Sentiment Analysis",
+    theme=gr.themes.Soft(
+        primary_hue="blue",
+        secondary_hue="purple",
+        neutral_hue="slate"
+    )
+) as demo:
+    gr.Markdown("""
+    # 🤖 Transformer Sentiment Analysis
+    Advanced AI-powered sentiment analysis using **DistilBERT** fine-tuned on IMDB dataset.
+    **Model Performance:**
+    - 🎯 Accuracy: **80%** on test set
+    - 📊 F1 Score: **0.7981**
+    - ⚡ Speed: ~100ms per prediction
+    - 🧠 Parameters: 66M (DistilBERT)
+    """)
+    with gr.Tabs():
+        with gr.TabItem("🔍 Single Analysis"):
+            with gr.Row():
+                with gr.Column(scale=2):
+                    single_input = gr.Textbox(
+                        label="💬 Enter your text",
+                        placeholder="Type your text here...",
+                        lines=4
+                    )
+                    single_btn = gr.Button("🚀 Analyze Sentiment", variant="primary", size="lg")
+                with gr.Column(scale=2):
+                    single_output = gr.Markdown(label="📋 Result")
+                    confidence_score = gr.Number(label="🎯 Confidence", precision=3)
+                    probability_plot = gr.Plot(label="📊 Probabilities")
+            gr.Examples(
+                examples=EXAMPLES,
+                inputs=single_input,
+                label="💡 Try these examples:"
+            )
+        with gr.TabItem("📊 Batch Processing"):
+            with gr.Row():
+                with gr.Column(scale=2):
+                    batch_input = gr.Textbox(
+                        label="📝 Multiple texts (one per line)",
+                        placeholder="Enter texts, one per line...",
+                        lines=8,
+                        value=BATCH_EXAMPLE
+                    )
+                    batch_btn = gr.Button("🚀 Process Batch", variant="primary", size="lg")
+                with gr.Column(scale=2):
+                    batch_output = gr.Markdown(label="📈 Results")
+                    batch_plot = gr.Plot(label="📊 Analytics")
+        with gr.TabItem("ℹ️ About"):
+            gr.Markdown("""
+            ## About This Model
+            ### 🏗️ Architecture
+            - **Model:** DistilBERT (Distilled BERT)
+            - **Parameters:** 66 million
+            - **Training:** Fine-tuned on IMDB dataset
+            - **Accuracy:** 80% on test set
+            ### ⚡ Performance
+            - **Speed:** ~100ms per prediction
+            - **Batch Processing:** Supported
+            - **Memory:** Optimized for production
+            ### 🚀 Tech Stack
+            - **Framework:** PyTorch + Transformers
+            - **Tracking:** MLflow experiments
+            - **UI:** Gradio
+            ### 🔗 Links
+            - **Model:** [MartinRodrigo/distilbert-sentiment-imdb](https://huggingface.co/MartinRodrigo/distilbert-sentiment-imdb)
+            - **GitHub:** [transformer-sentiment-analysis](https://github.com/mrdesautu/ransformer-sentiment-analysis)
+            ---
+            Built with ❤️ using Transformers, MLflow, and Gradio
+            """)
+    # Event handlers
+    single_btn.click(
+        fn=analyze_sentiment,
+        inputs=single_input,
+        outputs=[single_output, confidence_score, probability_plot]
+    )
+    batch_btn.click(
+        fn=analyze_batch_texts,
+        inputs=batch_input,
+        outputs=[batch_output, batch_plot]
+    )
+if __name__ == "__main__":
+    demo.launch()
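
Note on `analyze_single`: the post-processing step applies a softmax to the two logits, takes the arg-max as the predicted class, and reports that class's probability as the confidence. The sketch below reproduces only that step in plain Python so it can be checked without loading the model. The function names `softmax` and `decide` are hypothetical, and the logits are made-up values rather than output from the deployed model.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits: List[float]) -> Dict:
    """Mirror analyze_single's post-processing for a two-class model."""
    probs = softmax(logits)
    # Index of the largest probability; ties resolve to the first index.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return {
        # Same assumption as app.py: index 1 is the positive class.
        "sentiment": "POSITIVE" if predicted == 1 else "NEGATIVE",
        "confidence": probs[predicted],
        "probabilities": {"Negative": probs[0], "Positive": probs[1]},
    }


# Made-up logits for illustration, not output from the real model.
print(decide([-1.2, 2.3]))
```

For two classes the confidence reduces to a sigmoid of the logit difference, so it can never fall below 0.5.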

requirements.txt CHANGED

@@ -1,14 +1,6 @@
 transformers>=4.30.0
 torch>=2.0.0
-datasets>=2.0.0
-evaluate>=0.4.0
-scikit-learn>=1.0.0
-matplotlib>=3.5.0
-seaborn>=0.11.0
-numpy>=1.21.0
-pytest>=7.0.0
-fastapi>=0.100.0
-uvicorn[standard]>=0.20.0
-pydantic>=2.0.0
-python-multipart
-aiofiles

 transformers>=4.30.0
 torch>=2.0.0
+gradio>=4.0.0
+plotly>=5.0.0
+pandas>=1.5.0
+numpy>=1.24.0