# LSTM Seq2Seq Model for Translation

This repository contains the implementation of an LSTM-based Seq2Seq model for translation tasks. The model was trained on a bilingual dataset and evaluated with BLEU and ChrF scores to measure translation quality.

## Model Architecture

The model is a Seq2Seq architecture consisting of:

- **Embedding Layer**: Converts input tokens into dense vectors.
- **LSTM Encoder**: Encodes source-language sequences into a hidden representation.
- **LSTM Decoder**: Generates the translated target-language sequences from that hidden representation.
- **Linear Layer**: Maps the decoder output to the target vocabulary space.

## Training Details

- **Training Loss**: Cross-entropy loss with padding tokens ignored.
- **Optimizer**: Adam with a learning rate of 0.001.
- **Number of Epochs**: 10.
- **Batch Size**: 32.

## Evaluation Metrics

The model's performance was evaluated using:

- **BLEU Score**: Measures n-gram overlap between the generated and reference translations.
- **ChrF Score**: A character n-gram F-score for evaluating translation quality.

## Results

The training and validation loss, along with BLEU and ChrF scores, were plotted to analyze the model's performance:

- **Training Loss**: Decreased steadily over the epochs, indicating effective learning.
- **Validation Loss**: Showed minimal improvement, suggesting potential overfitting.
- **BLEU Score**: Improved gradually but remained relatively low, indicating that further tuning may be needed.
- **ChrF Score**: Increased consistently, reflecting better character-level accuracy in translations.

## Files Included

- **LSTM_model.ipynb**: Jupyter notebook containing the full implementation of the model, including data loading, training, and evaluation.
- **bleu_scores.csv**: BLEU scores for each epoch.
- **chrf_scores.csv**: ChrF scores for each epoch.
- **loss_plot.png**: Plot of training and validation loss.
- **bleu_score_plot.png**: Plot of BLEU scores over epochs.
- **chrf_score_plot.png**: Plot of ChrF scores over epochs.

## Future Work

- **Hyperparameter Tuning**: Experiment with different hyperparameters to improve model performance.
- **Data Augmentation**: Use data augmentation techniques to improve the model's ability to generalize.
- **Advanced Architectures**: Consider using attention mechanisms or transformer models for better performance.

## License

This project is licensed under the MIT License. See the LICENSE file for more details.
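For reference, the architecture and training setup described above (embedding → LSTM encoder → LSTM decoder → linear projection, cross-entropy with padding ignored, Adam at lr 0.001) can be sketched in PyTorch. This is an illustrative sketch only, not the code from `LSTM_model.ipynb`: the class name, vocabulary sizes, and embedding/hidden dimensions are assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal LSTM encoder-decoder with teacher forcing (illustrative sketch)."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim, padding_idx=0)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sequence; keep only the final (h, c) state.
        _, state = self.encoder(self.src_emb(src))
        # Decode with teacher forcing: feed the gold target tokens,
        # initializing the decoder from the encoder's final state.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab)

model = Seq2Seq(src_vocab=8000, tgt_vocab=8000)   # vocab sizes are placeholders
criterion = nn.CrossEntropyLoss(ignore_index=0)   # padding tokens ignored
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# One training step on a dummy batch of 32 (matching the batch size above).
src = torch.randint(1, 8000, (32, 20))
tgt = torch.randint(1, 8000, (32, 22))
logits = model(src, tgt[:, :-1])                  # predict the next token at each step
loss = criterion(logits.reshape(-1, 8000), tgt[:, 1:].reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The shift between decoder input (`tgt[:, :-1]`) and loss target (`tgt[:, 1:]`) is what makes each decoder step predict the following token.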
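To illustrate what the ChrF metric measures, here is a simplified character n-gram F-score in plain Python. This is a sketch of the idea only, not the full ChrF definition (the standard metric, e.g. in sacrebleu, averages per-order F-scores and handles whitespace differently); the function name and `beta` default are illustrative.

```python
from collections import Counter

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified ChrF-style score: character n-gram precision/recall
    averaged over orders 1..max_n, combined into an F-beta score."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = Counter(hypothesis[i:i + n] for i in range(len(hypothesis) - n + 1))
        ref = Counter(reference[i:i + n] for i in range(len(reference) - n + 1))
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue  # one side has no n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    p = sum(precisions) / len(precisions) if precisions else 0.0
    r = sum(recalls) / len(recalls) if recalls else 0.0
    if p + r == 0:
        return 0.0
    # F-beta with beta > 1 weighting recall more heavily, as ChrF does.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because it scores character n-grams rather than whole words, ChrF gives partial credit for near-miss word forms, which is why it can rise steadily even while BLEU stays low.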