# LSTM and Seq-to-Seq Language Translator

This project implements language translation using two approaches:

- **LSTM-based Translator**: A model that translates between English and Hebrew using a basic encoder-decoder architecture.
- **Seq-to-Seq Translator**: A sequence-to-sequence model without attention for bidirectional translation between English and Hebrew.

Both models are trained on a parallel dataset of 1,000 sentence pairs and evaluated with BLEU and chrF scores.

## Model Architectures

### 1. LSTM-Based Translator

The LSTM model consists of the following components:

- **Encoder**: Embedding and LSTM layers that encode English input sequences into latent representations.
- **Decoder**: Embedding and LSTM layers initialized with the encoder's final states, generating the Hebrew translation token by token.
- **Dense Layer**: A fully connected output layer with softmax activation that predicts the next word in the sequence.

### 2. Seq-to-Seq Translator

The Seq-to-Seq model uses:

- **Encoder**: As in the LSTM-based translator, this encodes the input sequence into context vectors.
- **Decoder**: Predicts the target sequence without attention, relying entirely on the encoded context.
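The encoder-decoder structure described above can be sketched in Keras roughly as follows. All layer widths and vocabulary sizes here are placeholder assumptions, not values taken from the project:

```python
# Illustrative sketch of the encoder-decoder architecture described above.
# EMBED_DIM, LATENT_DIM, and the vocabulary sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

SRC_VOCAB, TGT_VOCAB = 5000, 5000   # assumed vocabulary sizes
EMBED_DIM, LATENT_DIM = 128, 256    # assumed layer widths

# Encoder: embed English tokens and keep only the final LSTM states.
enc_inputs = layers.Input(shape=(None,), name="encoder_inputs")
enc_emb = layers.Embedding(SRC_VOCAB, EMBED_DIM)(enc_inputs)
_, state_h, state_c = layers.LSTM(LATENT_DIM, return_state=True)(enc_emb)

# Decoder: initialized with the encoder's states, predicts target
# tokens step by step (teacher forcing during training).
dec_inputs = layers.Input(shape=(None,), name="decoder_inputs")
dec_emb = layers.Embedding(TGT_VOCAB, EMBED_DIM)(dec_inputs)
dec_out, _, _ = layers.LSTM(
    LATENT_DIM, return_sequences=True, return_state=True
)(dec_emb, initial_state=[state_h, state_c])

# Dense softmax layer over the target vocabulary.
outputs = layers.Dense(TGT_VOCAB, activation="softmax")(dec_out)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

At inference time the decoder is typically run one token at a time, seeded with the `<start>` token, until `<end>` is produced.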
## Dataset

The models are trained on a custom parallel dataset of 1,000 English-Hebrew sentence pairs, stored as JSON with the fields `english` and `hebrew`. The Hebrew text is wrapped in `<start>` and `<end>` tokens to mark sequence boundaries during decoding.

### Preprocessing

- **Tokenization**: Text is tokenized with Keras' `Tokenizer`.
- **Padding**: Sequences are padded to a fixed length for training.

### Vocabulary Sizes

- English: [English Vocabulary Size]
- Hebrew: [Hebrew Vocabulary Size]

## Training Details

### Training Parameters

- Optimizer: Adam
- Loss Function: Sparse Categorical Crossentropy
- Batch Size: 32
- Epochs: 20
- Validation Split: 20%

### Checkpoints

The best-performing model weights (by validation loss) are saved with Keras' `ModelCheckpoint`.

### Training Metrics

Both models track:

- Training Loss
- Validation Loss

## Evaluation Metrics

### 1. BLEU Score

BLEU evaluates translation quality by comparing model output against reference translations; higher scores indicate better translations.

- LSTM Model BLEU: [BLEU Score for LSTM]
- Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]

### 2. chrF Score

chrF evaluates translations using character n-gram F-scores; higher scores indicate better translations.

- LSTM Model chrF: [chrF Score for LSTM]
- Seq-to-Seq Model chrF: [chrF Score for Seq-to-Seq]
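The tokenization and padding steps can be sketched as below. The two example sentence pairs are invented for illustration; note that the target tokenizer needs `filters=""` so the `<start>`/`<end>` markers survive:

```python
# Illustrative preprocessing sketch; the sentence pairs are made up.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

english = ["hello world", "good morning"]
hebrew = ["<start> shalom olam <end>", "<start> boker tov <end>"]

en_tok = Tokenizer()
en_tok.fit_on_texts(english)
# filters="" keeps < and > so the <start>/<end> tokens are not stripped.
he_tok = Tokenizer(filters="")
he_tok.fit_on_texts(hebrew)

# Convert to integer sequences and pad to a common length.
en_seqs = pad_sequences(en_tok.texts_to_sequences(english), padding="post")
he_seqs = pad_sequences(he_tok.texts_to_sequences(hebrew), padding="post")

# Vocabulary sizes (+1 for the reserved padding index 0).
en_vocab = len(en_tok.word_index) + 1
he_vocab = len(he_tok.word_index) + 1
```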
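A minimal sketch of that checkpoint setup (the output path is a hypothetical example):

```python
# Save only the best weights, judged by validation loss.
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(
    "best_model.keras",   # hypothetical output path
    monitor="val_loss",   # save whenever validation loss improves
    save_best_only=True,
)
# Passed to model.fit(..., callbacks=[checkpoint], validation_split=0.2)
```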
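To illustrate what the two metrics measure, here is a deliberately simplified, self-contained sketch: unigram BLEU precision (no brevity penalty or higher-order n-grams) and a character-bigram F-score. The project's actual scores come from library implementations, not this code:

```python
# Simplified metric sketches, for illustration only.
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Unigram precision: clipped fraction of candidate words in the reference."""
    cand, ref = candidate.split(), Counter(reference.split())
    if not cand:
        return 0.0
    matches = sum(min(c, ref[w]) for w, c in Counter(cand).items())
    return matches / len(cand)

def chrf_bigram(candidate: str, reference: str) -> float:
    """Character-bigram F1 between candidate and reference."""
    def bigrams(s: str) -> Counter:
        return Counter(s[i:i + 2] for i in range(len(s) - 1))
    c, r = bigrams(candidate), bigrams(reference)
    overlap = sum((c & r).values())
    if not overlap:
        return 0.0
    prec, rec = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec)
```

An identical candidate and reference score 1.0 on both measures; completely disjoint strings score 0.0.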
## Results

- **Training Loss Comparison**: The Seq-to-Seq model converged slightly faster than the LSTM model.
- **Translation Quality**: The BLEU and chrF scores indicate that both models produce reasonable translations, with the Seq-to-Seq model performing better on longer sentences.

## Acknowledgments

- Dataset: [Custom Parallel Dataset]
- Evaluation Tools: PyTorch BLEU, SacreBLEU chrF.