Commit f71c767 · Parent(s): ee7ef03
ROLLBACK: Restore PyABSA approach for high accuracy
Restored PyABSA implementation in data_processor.py
Updated requirements for ML dependencies (torch, transformers, pyabsa)
Fixed requirements-docker.txt for HF Spaces deployment
Updated secrets template for PyABSA approach
Created comprehensive PyABSA deployment guide
Rationale: Higher accuracy needed than the HF Inference API transformers approach
Strategy: HF Spaces backend + Streamlit Cloud frontend
Next: Deploy backend to HF Spaces with PyABSA models
- .streamlit/secrets.toml.template +6 -7
- PYABSA_DEPLOYMENT.md +195 -0
- requirements-docker.txt +1 -1
- requirements.txt +9 -12
- src/utils/data_processor.py +142 -236
.streamlit/secrets.toml.template
CHANGED

@@ -1,11 +1,10 @@
 # Streamlit Cloud Secrets Template
-# Copy this to your Streamlit Cloud app secrets
+# Copy this to your Streamlit Cloud app secrets if needed
 
-# Hugging Face
-HF_TOKEN = "hf_your_token_here"
+# Optional: Hugging Face token for additional model downloads
+# HF_TOKEN = "hf_your_token_here"
 
 # Instructions:
-#
-#
-#
-# 4. In Streamlit Cloud: Go to app settings > Secrets > Paste this content
+# - PyABSA models will be downloaded automatically on first run
+# - HF_TOKEN is optional and only needed for restricted models
+# - Leave this file empty if no special tokens are needed
PYABSA_DEPLOYMENT.md
ADDED

@@ -0,0 +1,195 @@
+# 🚀 PyABSA Deployment Guide: HF Spaces + Streamlit Cloud
+
+## Overview
+
+This guide covers deploying the high-accuracy PyABSA sentiment analysis application using a hybrid approach:
+- **Backend**: HF Spaces (Docker) for PyABSA processing
+- **Frontend**: Streamlit Cloud for the user interface
+
+## Why This Approach?
+
+✅ **High Accuracy**: PyABSA provides superior sentiment analysis compared to API-based solutions
+✅ **Reliability**: Local model processing eliminates API dependencies
+✅ **Scalability**: HF Spaces handles the heavy ML workload
+✅ **User Experience**: Streamlit Cloud provides fast frontend deployment
+
+## Architecture
+
+```
+User → Streamlit Cloud (Frontend) → HF Spaces (PyABSA Backend) → Results
+```
+
+## Deployment Steps
+
+### Phase 1: Deploy Backend to HF Spaces
+
+1. **Push to HF Spaces Repository**
+   ```bash
+   git push origin main
+   ```
+
+2. **Configure HF Spaces**
+   - Go to your HF Spaces settings
+   - Set the app type to "Docker"
+   - Hardware: CPU Basic (16GB RAM recommended for PyABSA)
+   - Dockerfile: Uses `requirements-docker.txt`
+
+3. **Monitor Deployment**
+   - First deployment takes 10-15 minutes (model downloads)
+   - Watch logs for PyABSA model loading
+   - Verify ABSA functionality works
+
+### Phase 2: Create Streamlit Cloud Frontend
+
+1. **Create Separate Frontend Repository**
+   ```bash
+   # Create a new repo for the frontend-only version
+   git clone https://github.com/yourusername/your-repo.git frontend-app
+   cd frontend-app
+   ```
+
+2. **Modify for API Connection**
+   - Update `app_enhanced.py` to connect to the HF Spaces backend
+   - Replace local processing with API calls to HF Spaces (see the sketch after these steps)
+   - Keep all visualizations and UI components
+
+3. **Deploy to Streamlit Cloud**
+   - Connect the GitHub repository
+   - Use the lightweight `requirements.txt` (no PyABSA/torch)
+   - Set environment variables for the HF Spaces API endpoint
+
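A minimal sketch of what that API connection could look like, assuming the backend exposes the `/process-reviews` endpoint described under "API Contract" below; the function name, `BACKEND_URL` variable, and default URL are illustrative, not taken from the repository:

```python
import os

import requests

# Hypothetical: point this at your deployed Space via an environment variable.
BACKEND_URL = os.getenv("BACKEND_URL", "https://yourusername-yourspace.hf.space")


def analyze_reviews_remote(reviews, translate=True, extract_aspects=True, timeout=120):
    """Send reviews to the HF Spaces backend instead of processing them locally."""
    payload = {
        "reviews": reviews,
        "options": {"translate": translate, "extract_aspects": extract_aspects},
    }
    response = requests.post(f"{BACKEND_URL}/process-reviews", json=payload, timeout=timeout)
    response.raise_for_status()
    # Expected keys per the API contract: processed_data, absa_details, analytics
    return response.json()
```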
+## Configuration Files
+
+### HF Spaces Configuration
+
+**`requirements-docker.txt`** (Heavy ML dependencies):
+```
+torch>=2.0.0,<2.2.0
+transformers>=4.30.0,<4.37.0
+pyabsa>=2.4.0,<3.0.0
+sentencepiece>=0.1.99
+sacremoses>=0.0.53
+faiss-cpu>=1.7.4
+# ... other dependencies
+```
+
+**`Dockerfile`** (Optimized for PyABSA):
+- Python 3.11 slim base
+- Proper cache directories for transformers
+- Non-root user for security
+- Port 7860 for HF Spaces
+
+### Streamlit Cloud Configuration
+
+**`requirements.txt`** (Lightweight frontend):
+```
+streamlit>=1.28.0
+pandas>=1.5.0
+plotly>=5.15.0
+requests>=2.31.0
+# No torch/transformers/pyabsa
+```
+
+## Troubleshooting
+
+### Common HF Spaces Issues
+
+1. **Model Download Timeout**
+   - Solution: Use CPU Basic with 16GB RAM
+   - Monitor logs for download progress
+
+2. **Memory Issues**
+   - Solution: Upgrade to a better hardware tier
+   - Optimize model loading in data_processor.py
+
+3. **File Upload Issues**
+   - Solution: Check Dockerfile permissions
+   - Ensure data directories are writable
+
+### Common Streamlit Cloud Issues
+
+1. **API Connection Failures**
+   - Verify the HF Spaces URL is correct
+   - Check network connectivity
+   - Add retry logic for API calls (see the sketch after this list)
+
+2. **Dependency Conflicts**
+   - Keep frontend requirements minimal
+   - Only include UI and API libraries
+
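The retry logic mentioned above can be a small wrapper with exponential backoff; a sketch (not code from the repository), which also helps when a sleeping Space needs time to wake up:

```python
import time

import requests


def post_with_retries(url, payload, attempts=3, backoff=2.0, timeout=120):
    """POST with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            response = requests.post(url, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == attempts - 1:
                raise  # Out of attempts; surface the error to the UI
            time.sleep(backoff * (2 ** attempt))  # wait 2s, then 4s, ...
```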
+## Performance Optimization
+
+### HF Spaces Backend
+- Use CPU-optimized PyTorch builds
+- Implement model caching
+- Add request batching for multiple reviews
+
+### Streamlit Cloud Frontend
+- Implement caching for API responses (see the sketch after this list)
+- Use progress indicators for long operations
+- Optimize chart rendering
+
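For the API-response caching suggested above, Streamlit's built-in `st.cache_data` is a natural fit; a sketch reusing the hypothetical `analyze_reviews_remote` helper from Phase 2:

```python
import streamlit as st


@st.cache_data(ttl=3600, show_spinner="Analyzing reviews...")
def cached_analysis(reviews: tuple):
    # Pass reviews as a tuple so the argument is hashable; identical
    # batches then hit the cache instead of re-calling the backend.
    return analyze_reviews_remote(list(reviews))
```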
+## Monitoring and Maintenance
+
+### Health Checks
+- Monitor HF Spaces uptime
+- Check model loading status
+- Verify API endpoints respond correctly
+
+### Updates
+1. Deploy backend changes to HF Spaces first
+2. Test API compatibility
+3. Update the frontend to match the new API contract
+4. Deploy frontend changes to Streamlit Cloud
+
+## Cost Considerations
+
+### HF Spaces
+- CPU Basic: ~$0.05/hour when running
+- Automatic shutdown when inactive
+- Pay only for usage
+
+### Streamlit Cloud
+- Community tier: Free
+- No resource limits for frontend-only apps
+
+## Security Notes
+
+- No sensitive data stored on either platform
+- File uploads processed securely
+- No permanent data storage
+- HTTPS encryption end-to-end
+
+## API Contract (Frontend ↔ Backend)
+
+### POST `/process-reviews`
+```json
+{
+  "reviews": ["Review text 1", "Review text 2"],
+  "options": {
+    "translate": true,
+    "extract_aspects": true
+  }
+}
+```
+
+### Response
+```json
+{
+  "processed_data": {...},
+  "absa_details": [...],
+  "analytics": {...}
+}
+```
+
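This commit does not include the server code that implements the contract; as one possibility, the backend could expose it with FastAPI (an assumption here, not a dependency listed anywhere in this commit):

```python
from typing import Any, Dict, List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ReviewRequest(BaseModel):
    reviews: List[str]
    options: Dict[str, bool] = {}


@app.post("/process-reviews")
def process_reviews(req: ReviewRequest) -> Dict[str, Any]:
    # Placeholder body: wire this to TranslationService / ABSAProcessor
    # from src/utils/data_processor.py in a real backend.
    return {"processed_data": {}, "absa_details": [], "analytics": {}}
```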
+## Next Steps
+
+1. ✅ Deploy current version to HF Spaces
+2. ⚡ Create frontend-only version for Streamlit Cloud
+3. 🔗 Implement API communication layer
+4. 🚀 Test end-to-end functionality
+5. 📊 Monitor performance and optimize
+
+---
+
+*This deployment strategy provides the best of both worlds: PyABSA's accuracy with cloud-native scalability.*
requirements-docker.txt
CHANGED

@@ -1,7 +1,7 @@
 # Core ML and NLP Libraries
 torch>=2.0.0,<2.2.0
 transformers>=4.30.0,<4.37.0
-
+pyabsa>=2.4.0,<3.0.0  # Restored for high accuracy ABSA
 sentencepiece>=0.1.99
 sacremoses>=0.0.53
 faiss-cpu>=1.7.4
requirements.txt
CHANGED

@@ -1,6 +1,4 @@
-
-pandas
-# Streamlit Cloud Requirements - Optimized for API approach
+# Production Streamlit Requirements - PyABSA Enhanced ABSA
 streamlit>=1.28.0
 pandas>=1.5.0
 numpy>=1.24.0

@@ -14,16 +12,15 @@ streamlit-option-menu>=0.3.6
 streamlit-aggrid>=0.3.4
 joblib>=1.3.0
 pillow>=10.0.0
-requests>=2.31.0
-faker>=18.0.0
 networkx>=3.0
 openpyxl>=3.1.0
 reportlab>=4.0.0
+faker>=18.0.0
 
-#
+# Enhanced ML Dependencies for High Accuracy ABSA
+torch>=1.13.0
+transformers>=4.30.0
+pyabsa>=2.4.0
+sentencepiece>=0.1.99
+sacremoses>=0.0.53
+faiss-cpu>=1.7.4
src/utils/data_processor.py
CHANGED

@@ -14,9 +14,6 @@ import streamlit as st
 from collections import Counter, defaultdict
 from itertools import combinations
 import networkx as nx
-import requests
-import os
-import time
 
 # Set up logging
 logging.basicConfig(level=logging.INFO)

@@ -82,59 +79,27 @@
 
 
 class TranslationService:
-    """Handles translation
+    """Handles translation from Hindi to English using M2M100."""
 
     def __init__(self):
-        self.
-        self.
-        self.
-
-        """Get HF token from environment or Streamlit secrets."""
-        try:
-            return st.secrets["HF_TOKEN"]
-        except:
-            pass
-
-        token = os.getenv("HF_TOKEN")
-        if not token:
-            logger.warning("No HF_TOKEN found. Translation will be limited.")
-        return token
-
-    def _call_hf_translation_api(self, text: str, source_lang: str = "hi", target_lang: str = "en") -> str:
-        """Call HF Translation API with fallback."""
-        if not self.api_token:
-            logger.warning("No API token, skipping translation")
-            return text
-
+        self.model = None
+        self.tokenizer = None
+        self._load_model()
+
+    def _load_model(self):
+        """Load M2M100 model for translation."""
         try:
-                    "src_lang": source_lang,
-                    "tgt_lang": target_lang
-                }
-            }
-
-            response = requests.post(url, headers=headers, json=payload, timeout=30)
-
-            if response.status_code == 200:
-                result = response.json()
-                if isinstance(result, list) and len(result) > 0:
-                    return result[0].get("translation_text", text)
-
-            logger.warning(f"Translation API failed: {response.status_code}")
-            return text
-
+            from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
+
+            model_name = "facebook/m2m100_418M"
+            self.tokenizer = M2M100Tokenizer.from_pretrained(model_name)
+            self.model = M2M100ForConditionalGeneration.from_pretrained(model_name)
+
+            logger.info("Translation model loaded successfully")
         except Exception as e:
-            logger.error(f"
+            logger.error(f"Error loading translation model: {str(e)}")
+            st.error(f"Failed to load translation model: {str(e)}")
 
     def detect_language(self, text: str) -> str:
        """Detect language of the text."""
        try:
@@ -142,230 +107,171 @@
             return lang
         except:
             return 'unknown'
 
     def translate_to_english(self, text: str, source_lang: str = 'hi') -> str:
         """
-        Translate text to English
+        Translate text to English.
 
         Args:
             text: Text to translate
             source_lang: Source language code
 
         Returns:
             Translated text
         """
-        if
+        if not self.model or not self.tokenizer:
             return text
 
+        try:
+            # Set source language
+            self.tokenizer.src_lang = source_lang
+
+            # Encode and translate
+            encoded = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
+
+            # Generate translation
+            generated_tokens = self.model.generate(
+                **encoded,
+                forced_bos_token_id=self.tokenizer.get_lang_id("en"),
+                max_length=512,
+                num_beams=2,
+                early_stopping=True
+            )
+
+            # Decode translation
+            translation = self.tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
+            return translation.strip()
+
+        except Exception as e:
+            logger.error(f"Translation error: {str(e)}")
+            return text
+
     def process_reviews(self, reviews: List[str]) -> Tuple[List[str], List[str]]:
         """
         Process list of reviews for translation.
 
         Args:
             reviews: List of review texts
 
         Returns:
             Tuple of (translated_reviews, detected_languages)
         """
         translated_reviews = []
         detected_languages = []
 
-        for i, review in enumerate(reviews):
-            if i % 20 == 0:  # Progress logging
-                logger.info(f"Processing translation {i+1}/{len(reviews)}")
-
+        for review in reviews:
             lang = self.detect_language(review)
             detected_languages.append(lang)
 
             if lang == 'hi':  # Hindi detected
                 translated = self.translate_to_english(review, 'hi')
                 translated_reviews.append(translated)
             else:
                 translated_reviews.append(review)  # Keep original if not Hindi
 
         return translated_reviews, detected_languages
 
 
 class ABSAProcessor:
-    """
+    """Handles Aspect-Based Sentiment Analysis using pyABSA."""
 
     def __init__(self):
-        self.
-        self.
-
-    def _get_hf_token(self) -> Optional[str]:
-        """Get HF token from environment or Streamlit secrets."""
-        # Try Streamlit secrets first
+        self.aspect_extractor = None
+        self._load_model()
+
+    def _load_model(self):
+        """Load pyABSA model with fallback error handling."""
         try:
+            # Import inside try block to catch any import-time type errors
+            import pyabsa
+            from pyabsa import ATEPCCheckpointManager
+
+            # Try multiple checkpoint options in order of preference
+            checkpoint_options = [
+                'multilingual',
+                'multilingual2',
+                None  # Let pyABSA use default
+            ]
+
+            for checkpoint in checkpoint_options:
+                try:
+                    logger.info(f"Attempting to load ABSA checkpoint: {checkpoint}")
+
+                    if checkpoint is None:
+                        # Try without specifying checkpoint
+                        self.aspect_extractor = ATEPCCheckpointManager.get_aspect_extractor(
+                            auto_device=True,
+                            task_code='ATEPC'
+                        )
+                    else:
+                        self.aspect_extractor = ATEPCCheckpointManager.get_aspect_extractor(
+                            checkpoint=checkpoint,
+                            auto_device=True,
+                            task_code='ATEPC'
+                        )
+
+                    logger.info(f"ABSA model loaded successfully with checkpoint: {checkpoint}")
+                    return  # Success, exit the method
+
+                except Exception as e:
+                    logger.warning(f"Failed to load checkpoint '{checkpoint}': {str(e)}")
+                    continue  # Try next checkpoint
+
+            # If all checkpoints failed
+            logger.error("All ABSA checkpoint options failed")
+            self.aspect_extractor = None
+
+        except ImportError as e:
+            logger.error(f"pyABSA library not available: {str(e)}")
+            st.warning("⚠️ ABSA functionality unavailable. Advanced sentiment analysis will be limited.")
+            self.aspect_extractor = None
+        except TypeError as e:
+            # Handle Python version compatibility issues
+            logger.error(f"Type compatibility error in pyABSA: {str(e)}")
+            st.warning("⚠️ ABSA model incompatible with current Python version. Using fallback sentiment analysis.")
+            self.aspect_extractor = None
+        except Exception as e:
+            logger.error(f"Error loading ABSA model: {str(e)}")
+            st.warning(f"⚠️ Could not load ABSA model: {str(e)[:100]}... Using basic sentiment analysis.")
+            self.aspect_extractor = None
+
     def extract_aspects_and_sentiments(self, reviews: List[str]) -> List[Dict[str, Any]]:
         """
-        Extract aspects and sentiments
+        Extract aspects and sentiments from reviews.
 
         Args:
             reviews: List of review texts
 
         Returns:
             List of dictionaries containing extracted information
         """
-        for i, review in enumerate(reviews):
-            if i % 10 == 0:  # Progress logging
-                logger.info(f"Processing review {i+1}/{len(reviews)}")
-
-            # Get sentiment from HF API
-            sentiment = self._get_hf_sentiment(review)
-
-            # Extract aspects using rule-based approach
-            aspects = self._extract_simple_aspects(review)
-
-            processed_result = {
-                'sentence': review,
-                'aspects': aspects,
-                'sentiments': [sentiment] * len(aspects),
-                'positions': [[0, len(review)]] * len(aspects),
-                'confidence_scores': [0.8] * len(aspects),  # HF models are quite confident
-                'tokens': review.split(),
-                'iob_tags': ['O'] * len(review.split())
-            }
-            processed_results.append(processed_result)
-
-        logger.info(f"Successfully processed {len(processed_results)} reviews")
-        return processed_results
-
-    def _get_hf_sentiment(self, text: str) -> str:
-        """Get sentiment from HF Inference API with fallback."""
-        if not self.api_token:
-            # Fallback to rule-based if no API token
-            return self._get_rule_based_sentiment(text)
-
+        if not self.aspect_extractor:
+            logger.warning("ABSA model not available, returning empty results")
+            return []
+
         try:
-                return 'Neutral'
-
-            # Fallback if parsing fails
-            return self._get_rule_based_sentiment(text)
-
+            results = self.aspect_extractor.extract_aspect(
+                reviews,
+                pred_sentiment=True
+            )
+
+            processed_results = []
+            for result in results:
+                processed_result = {
+                    'sentence': result['sentence'],
+                    'aspects': result.get('aspect', []),
+                    'sentiments': result.get('sentiment', []),
+                    'positions': result.get('position', []),
+                    'confidence_scores': result.get('confidence', []),
+                    'tokens': result.get('tokens', []),
+                    'iob_tags': result.get('IOB', [])
+                }
+                processed_results.append(processed_result)
+
+            return processed_results
         except Exception as e:
-            logger.error(f"
-            return
-
-    def _get_rule_based_sentiment(self, review: str) -> str:
-        """Fallback rule-based sentiment analysis."""
-        review_lower = review.lower()
-
-        # Enhanced sentiment words
-        positive_words = ['good', 'great', 'excellent', 'amazing', 'love', 'best', 'awesome',
-                          'fantastic', 'wonderful', 'perfect', 'satisfied', 'happy', 'pleased',
-                          'outstanding', 'brilliant', 'superb', 'delighted', 'impressed']
-
-        negative_words = ['bad', 'terrible', 'awful', 'hate', 'worst', 'horrible', 'poor',
-                          'disappointing', 'frustrated', 'angry', 'broken', 'failed', 'useless',
-                          'pathetic', 'disgusting', 'annoying', 'waste', 'regret']
-
-        pos_count = sum(1 for word in positive_words if word in review_lower)
-        neg_count = sum(1 for word in negative_words if word in review_lower)
-
-        if pos_count > neg_count:
-            return 'Positive'
-        elif neg_count > pos_count:
-            return 'Negative'
-        else:
-            return 'Neutral'
-
-    def _extract_simple_aspects(self, review: str) -> List[str]:
-        """Extract aspects using enhanced keyword matching."""
-        review_lower = review.lower()
-        aspects = []
-
-        # Enhanced aspect keywords
-        aspect_keywords = {
-            'Quality': ['quality', 'build', 'material', 'construction', 'durability', 'solid', 'sturdy', 'cheap', 'flimsy'],
-            'Price': ['price', 'cost', 'expensive', 'cheap', 'value', 'money', 'affordable', 'budget', 'worth'],
-            'Service': ['service', 'support', 'help', 'staff', 'customer', 'response', 'team', 'representative'],
-            'Delivery': ['delivery', 'shipping', 'fast', 'quick', 'slow', 'delayed', 'arrive', 'package'],
-            'Design': ['design', 'look', 'appearance', 'beautiful', 'ugly', 'style', 'color', 'aesthetic'],
-            'Performance': ['performance', 'speed', 'fast', 'slow', 'efficiency', 'works', 'function', 'smooth'],
-            'Usability': ['easy', 'difficult', 'user', 'interface', 'intuitive', 'complex', 'simple', 'confusing'],
-            'Features': ['feature', 'function', 'capability', 'option', 'setting', 'mode', 'tool'],
-            'Size': ['size', 'big', 'small', 'large', 'compact', 'tiny', 'huge', 'dimension'],
-            'Battery': ['battery', 'charge', 'power', 'energy', 'last', 'drain', 'life']
-        }
-
-        for aspect, keywords in aspect_keywords.items():
-            if any(keyword in review_lower for keyword in keywords):
-                aspects.append(aspect)
-
-        # Default aspect if none found
-        if not aspects:
-            aspects = ['General']
-
-        return aspects
+            logger.error(f"ABSA extraction error: {str(e)}")
+            return []
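Taken together, the restored classes can be exercised roughly like this (a sketch using only the class and method names from the diff above, and assuming `src/` is on the import path):

```python
from src.utils.data_processor import TranslationService, ABSAProcessor

reviews = [
    "The battery life is great but the price is too high.",
    "सेवा बहुत खराब थी",  # Hindi review ("The service was very bad")
]

# Translate Hindi reviews to English, keeping other languages as-is
translator = TranslationService()
translated, languages = translator.process_reviews(reviews)

# Run PyABSA aspect extraction + sentiment on the translated texts
absa = ABSAProcessor()
for item in absa.extract_aspects_and_sentiments(translated):
    print(item["sentence"], item["aspects"], item["sentiments"])
```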