---
license: mit
language:
- en
- vi
metrics:
- accuracy
- f1
- recall
- precision
pipeline_tag: text-classification
tags:
- analysis
- sentiment
- text-classification
---

# Sentiment Analysis Using LSTM and CNN

This project implements a hybrid deep learning model combining **Long Short-Term Memory (LSTM)** networks and **Convolutional Neural Networks (CNN)** for sentiment analysis. The architecture leverages the strengths of both LSTM and CNN to process textual data and classify sentiments effectively.

---

## Model Architecture

![image](Overall.png)

The architecture consists of two parallel branches that process the input text sequences and merge their outputs for final classification:

### **Branch 1: CNN-Based Processing**

1. **Embedding Layer**: Converts input sequences into dense vector representations.
2. **Conv1D + Activation**: Extracts local features from the text using convolutional filters.
3. **MaxPooling1D**: Reduces the spatial dimensions while retaining the most important features.
4. **BatchNormalization**: Normalizes the activations to stabilize and accelerate training.
5. **Conv1D + MaxPooling1D + BatchNormalization**: Repeats the convolution and pooling process to extract deeper features.
6. **Flatten**: Converts the 2D feature maps into a 1D vector.

### **Branch 2: LSTM-Based Processing**

1. **Embedding Layer**: Similar to the CNN branch, converts input sequences into dense vector representations.
2. **Bidirectional LSTM**: Captures long-term dependencies in the text by processing it in both forward and backward directions.
3. **LayerNormalization**: Normalizes the outputs of the LSTM layer.
4. **Bidirectional GRU**: Further processes the sequence with Gated Recurrent Units for efficiency.
5. **LayerNormalization**: Normalizes the GRU outputs.
6. **Flatten**: Converts the sequence outputs into a 1D vector.

### **Merging and Classification**

1. **Concatenate**: Combines the outputs of the CNN and LSTM branches.
2. **Dense Layers with Dropout**: Fully connected layers with ReLU activation and dropout for regularization.
3. **Output Layer**: A dense layer with a softmax activation function to classify the sentiment into three categories: Positive, Neutral, and Negative.

---

## Why LSTM + CNN for Sentiment Analysis?

### **LSTM Strengths**

- LSTMs are well suited to capturing long-term dependencies in sequential data such as text.
- They excel at understanding the context and relationships between words in a sentence.

### **CNN Strengths**

- CNNs are effective at extracting local patterns and features, such as n-grams, from text data.
- They are computationally efficient and can process data in parallel.

### **Hybrid Approach**

By combining LSTM and CNN, the model benefits from:

- **Contextual Understanding**: LSTM captures the sequential nature of text.
- **Feature Extraction**: CNN identifies important local patterns.
- **Robustness**: The merged architecture improves generalization and performance on sentiment classification tasks.

---

## Applications

This model can be used for:

- Social media sentiment analysis (e.g., Twitter, Reddit).
- Customer feedback classification.
- Opinion mining in reviews and surveys.

---

## Training and Evaluation

The model is trained on labeled datasets of text paired with sentiment labels. It uses:

- **Sparse Categorical Crossentropy** as the loss function.
- **AdamW Optimizer** for efficient training.
- **Early Stopping** and **Model Checkpoints** to prevent overfitting and save the best model.

Performance is evaluated with accuracy, a confusion matrix, and a classification report.

---

## Conclusion

The hybrid LSTM + CNN architecture provides a powerful framework for sentiment analysis, combining the strengths of sequential modeling and feature extraction. This approach is versatile and can be adapted to various text classification tasks.

## License

MIT License
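As an illustration, the two-branch architecture described above can be sketched in Keras. This is a minimal sketch only: the vocabulary size, sequence length, embedding dimension, filter counts, kernel sizes, and unit counts are assumptions for demonstration, not values taken from this project.

```python
# Hypothetical sketch of the hybrid CNN + LSTM architecture described above.
# All sizes below are assumed, not taken from the original project.
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 20000  # assumed vocabulary size
MAX_LEN = 100       # assumed padded sequence length
EMBED_DIM = 128     # assumed embedding dimension

inputs = layers.Input(shape=(MAX_LEN,), dtype="int32")

# Branch 1: CNN-based processing
cnn = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
cnn = layers.Conv1D(64, 5, activation="relu")(cnn)
cnn = layers.MaxPooling1D(2)(cnn)
cnn = layers.BatchNormalization()(cnn)
cnn = layers.Conv1D(32, 3, activation="relu")(cnn)
cnn = layers.MaxPooling1D(2)(cnn)
cnn = layers.BatchNormalization()(cnn)
cnn = layers.Flatten()(cnn)

# Branch 2: LSTM-based processing
rnn = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
rnn = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(rnn)
rnn = layers.LayerNormalization()(rnn)
rnn = layers.Bidirectional(layers.GRU(32, return_sequences=True))(rnn)
rnn = layers.LayerNormalization()(rnn)
rnn = layers.Flatten()(rnn)

# Merge and classify into Positive / Neutral / Negative
merged = layers.Concatenate()([cnn, rnn])
x = layers.Dense(128, activation="relu")(merged)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(3, activation="softmax")(x)

model = Model(inputs, outputs)
```

Because both branches read the same padded token sequence, the model needs only a single input; the softmax head's three units correspond to the three sentiment classes.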
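The training setup above (sparse categorical crossentropy, AdamW, early stopping, and model checkpoints) could be wired up as in the following sketch. The learning rate, patience, checkpoint file name, and the stand-in model are assumptions, not values from this project.

```python
# Hypothetical training configuration matching the loss, optimizer, and
# callbacks named above. Hyperparameters are assumed, not from the project.
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Stand-in model; in practice this would be the hybrid LSTM + CNN model.
inp = tf.keras.Input(shape=(100,))
out = layers.Dense(3, activation="softmax")(inp)
model = tf.keras.Model(inp, out)

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),  # assumed LR
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

cbs = [
    # Stop when validation loss plateaus and keep the best weights.
    callbacks.EarlyStopping(monitor="val_loss", patience=3,
                            restore_best_weights=True),
    # Save the best model seen so far; file name is an assumption.
    callbacks.ModelCheckpoint("best_model.keras", monitor="val_loss",
                              save_best_only=True),
]

# Training call (commented out; x_train / y_train etc. are placeholders):
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=20, callbacks=cbs)
```

Sparse categorical crossentropy lets the labels stay as integer class IDs (0, 1, 2) rather than one-hot vectors, which matches the three-class softmax output.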