tejagowda/Translator

tejagowda committed on Nov 16, 2024

Commit

1e1092f

1 Parent(s): f26c9e0

Update README.md

README.md +140 -0

	@@ -0,0 +1,140 @@

+LSTM and Seq-to-Seq Language Translator
+This project implements language translation using two approaches:
+LSTM-based Translator: A model that translates between English and Hebrew using a basic encoder-decoder architecture.
+Seq-to-Seq Translator: A sequence-to-sequence model without attention for bidirectional translation between English and Hebrew.
+Both models are trained on a parallel dataset of 1000 sentence pairs and evaluated using BLEU and CHRF scores.
+Model Architectures
+1. LSTM-Based Translator
+The LSTM model is built with the following components:
+Encoder: Embedding and LSTM layers to encode English input sequences into latent representations.
+Decoder: Embedding and LSTM layers initialized with the encoder's states, generating Hebrew translations token-by-token.
+Dense Layer: A fully connected output layer with a softmax activation to predict the next word in the sequence.
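
The three components above can be wired together in Keras roughly as follows. This is a minimal sketch rather than the project's actual code: the vocabulary sizes, embedding dimension, and number of LSTM units are placeholder assumptions, since the README does not state them. Only the optimizer and loss are taken from the Training Details section.

```python
from tensorflow.keras import layers, Model

# Assumed hyperparameters (not given in the README).
SRC_VOCAB = 50   # English vocabulary size, including padding index 0
TGT_VOCAB = 60   # Hebrew vocabulary size, including padding index 0
EMBED_DIM = 32
UNITS = 64

# Encoder: embedding + LSTM; only the final hidden and cell states are kept.
encoder_inputs = layers.Input(shape=(None,), name="english_tokens")
enc_emb = layers.Embedding(SRC_VOCAB, EMBED_DIM)(encoder_inputs)
_, state_h, state_c = layers.LSTM(UNITS, return_state=True)(enc_emb)

# Decoder: embedding + LSTM initialized with the encoder's states.
decoder_inputs = layers.Input(shape=(None,), name="hebrew_tokens_shifted")
dec_emb = layers.Embedding(TGT_VOCAB, EMBED_DIM)(decoder_inputs)
dec_out, _, _ = layers.LSTM(UNITS, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c]
)

# Dense layer: softmax over the Hebrew vocabulary at every time step.
outputs = layers.Dense(TGT_VOCAB, activation="softmax")(dec_out)

model = Model([encoder_inputs, decoder_inputs], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

During training the decoder input would typically be the target sentence shifted right by one token (teacher forcing), so the model learns to predict each next word from the previous ones.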
+2. Seq-to-Seq Translator
+The Seq-to-Seq model uses:
+Encoder: Similar to the LSTM-based translator, this encodes the input sequence into context vectors.
+Decoder: Predicts the target sequence without attention, relying entirely on the encoded context.
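
Without attention, the decoder sees the source sentence only through the context handed over by the encoder, so inference is a loop that feeds each predicted token back in. The sketch below shows greedy decoding in plain Python. The names `greedy_decode` and `step_fn`, the token ids, and the toy decoder are hypothetical stand-ins for a trained model.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]  # stand-in for the LSTM (h, c) state


def greedy_decode(
    context: State,
    step_fn: Callable[[int, State], Tuple[Sequence[float], State]],
    start_id: int,
    end_id: int,
    max_len: int = 20,
) -> List[int]:
    """Generate token ids one at a time until <end> or max_len is reached."""
    token, state, output = start_id, context, []
    for _ in range(max_len):
        probs, state = step_fn(token, state)
        token = max(range(len(probs)), key=probs.__getitem__)  # argmax
        if token == end_id:
            break
        output.append(token)
    return output


# Toy decoder (assumption): after <start>=1 it emits 3, then 4, then <end>=2.
def toy_step(token: int, state: State):
    script = {1: 3, 3: 4, 4: 2}
    probs = [0.0] * 5
    probs[script[token]] = 1.0
    return probs, state


result = greedy_decode(context=(0,), step_fn=toy_step, start_id=1, end_id=2)
print(result)  # [3, 4]
```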
+Dataset
+The models are trained on a custom parallel dataset of 1,000 English-Hebrew sentence pairs, stored as JSON with the fields english and hebrew. Each Hebrew sentence is wrapped in <start> and <end> tokens so the decoder knows where a translation begins and ends.
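
For illustration, records in this format could be read as shown below. The field names english and hebrew come from the description above. The top-level JSON list, the sample sentences, and the helper name `load_pairs` are assumptions, and the helper only adds the boundary tokens when a record does not already carry them.

```python
import json


def load_pairs(raw_json: str):
    """Return (english, hebrew) tuples, wrapping the target in <start>/<end> if missing."""
    pairs = []
    for record in json.loads(raw_json):
        target = record["hebrew"].strip()
        if not target.startswith("<start>"):
            target = f"<start> {target} <end>"
        pairs.append((record["english"].strip(), target))
    return pairs


# Two made-up records: one without and one with the boundary tokens.
sample = (
    '[{"english": "Hello", "hebrew": "שלום"},'
    ' {"english": "Thank you", "hebrew": "<start> תודה <end>"}]'
)
pairs = load_pairs(sample)
```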
+Preprocessing:
+Tokenization: Text is tokenized using Keras' Tokenizer.
+Padding: Sequences are padded to a fixed length for training.
+Vocabulary Sizes:
+English: [English Vocabulary Size]
+Hebrew: [Hebrew Vocabulary Size]
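
The tokenization and padding steps can be sketched in plain Python as follows. This mirrors the general idea behind Keras' Tokenizer and pad_sequences (frequency-ranked integer ids starting at 1, with 0 reserved for padding) but is not the Keras implementation. The function names and the alphabetical tie-breaking are choices made for this sketch.

```python
from collections import Counter

PAD_ID = 0  # index 0 is reserved for padding


def build_vocab(sentences):
    """Map each word to an integer id, most frequent word first, starting at 1."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ranked, start=1)}


def encode(sentence, vocab):
    """Convert a sentence to ids, skipping words that are not in the vocabulary."""
    return [vocab[w] for w in sentence.lower().split() if w in vocab]


def pad(sequences, max_len):
    """Right-pad (or truncate) every sequence to exactly max_len ids."""
    return [seq[:max_len] + [PAD_ID] * (max_len - len(seq)) for seq in sequences]


sentences = ["the cat sat", "the dog"]
vocab = build_vocab(sentences)
padded = pad([encode(s, vocab) for s in sentences], max_len=4)
print(vocab)   # {'the': 1, 'cat': 2, 'dog': 3, 'sat': 4}
print(padded)  # [[1, 2, 4, 0], [1, 3, 0, 0]]
```

Two details are worth checking in a real Keras pipeline. The default `filters` argument of Keras' Tokenizer removes punctuation including `<` and `>`, so `<start>` and `<end>` are reduced to `start` and `end` unless `filters` is overridden. Also, `pad_sequences` pads at the front by default; right-padding as in this sketch corresponds to `padding="post"`.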
+Training Details
+Training Parameters:
+Optimizer: Adam
+Loss Function: Sparse Categorical Crossentropy
+Batch Size: 32
+Epochs: 20
+Validation Split: 20%
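
Sparse categorical cross-entropy takes integer target ids rather than one-hot vectors: for each token, the loss is the negative log of the probability the model assigned to the correct id. A minimal plain-Python sketch with made-up probabilities (the function below is illustrative, not the Keras implementation):

```python
import math


def sparse_categorical_crossentropy(target_ids, predicted_probs):
    """Mean of -log p[target] over all time steps."""
    losses = [-math.log(probs[t]) for t, probs in zip(target_ids, predicted_probs)]
    return sum(losses) / len(losses)


# Two time steps over a three-word vocabulary (values are invented).
targets = [2, 0]
probs = [
    [0.10, 0.20, 0.70],  # correct id 2 received probability 0.70
    [0.50, 0.25, 0.25],  # correct id 0 received probability 0.50
]
loss = sparse_categorical_crossentropy(targets, probs)
print(round(loss, 4))
```

Because the targets stay as integers, the Hebrew sequences never need to be expanded into one-hot matrices, which keeps memory use proportional to sequence length rather than vocabulary size.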
+Checkpoints:
+Models are saved at their best-performing stages based on validation loss using Keras' ModelCheckpoint.
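
The effect of checkpointing on validation loss can be sketched in plain Python: a checkpoint is written only when the validation loss improves on the best value seen so far. In Keras this behaviour corresponds to `ModelCheckpoint` with `monitor="val_loss"` and `save_best_only=True`. The loss values and the function name below are invented for illustration.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a best-so-far checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better validation loss
            best = loss
            saved.append(epoch)
    return saved


history = [2.31, 1.87, 1.90, 1.62, 1.65, 1.64]
print(epochs_that_save(history))  # [1, 2, 4]
```

The file left on disk after training is therefore the model from the epoch with the lowest validation loss, which is not necessarily the final epoch.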
+Training Metrics:
+Both models track:
+Training Loss
+Validation Loss
+Evaluation Metrics
+1. BLEU Score:
+The BLEU metric evaluates the quality of translations by comparing them to reference translations. Higher BLEU scores indicate better translations.
+LSTM Model BLEU: [BLEU Score for LSTM]
+Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]
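
To make the metric concrete, here is a simplified sentence-level BLEU in plain Python: clipped n-gram precisions for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It is a sketch under simplifying assumptions (whitespace tokenization, one reference, no smoothing), so its numbers will not match library implementations exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Geometric mean of clipped n-gram precisions times the brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing: any empty order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


score = sentence_bleu("the cat sat on the mat", "the cat sat on a mat")
print(round(score, 4))
```

Because there is no smoothing, short sentences with no matching 4-gram score 0, which matters for a small dataset of short sentence pairs.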
+2. CHRF Score:
+The CHRF metric evaluates translations using character-level F-scores. Higher CHRF scores indicate better translations.
+LSTM Model CHRF: [CHRF Score for LSTM]
+Seq-to-Seq Model CHRF: [CHRF Score for Seq-to-Seq]
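
A simplified version of the character-level F-score can be written in a few lines. The sketch below averages character n-gram precision and recall over orders 1 to 6 and combines them with an F-beta score (beta = 2, which weights recall more heavily). Ignoring spaces and averaging before computing the F-score are assumptions of this sketch; library implementations differ in such details.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams; spaces are ignored in this sketch."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(candidate, reference, max_n=6, beta=2.0):
    """F-beta over character n-gram precision and recall averaged across orders."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        cand, ref = char_ngrams(candidate, n), char_ngrams(reference, n)
        if not cand or not ref:
            continue  # text too short for this n-gram order
        overlap = sum((cand & ref).values())  # clipped matches
        precisions.append(overlap / sum(cand.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)


print(chrf("hello world", "hello world"))  # 1.0
print(chrf("abc", "xyz"))                  # 0.0
```

Character-level matching gives partial credit for words that share a stem but differ in inflection, which is relevant for a morphologically rich target language such as Hebrew.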
+Results
+Training Loss Comparison: The Seq-to-Seq model converged slightly better than the LSTM model, which is attributed to its structured architecture.
+Translation Quality: The BLEU and CHRF scores indicate that both models provide reasonable translations, with the Seq-to-Seq model performing better on longer sentences.
+Acknowledgments
+Dataset: [Custom Parallel Dataset]
+Evaluation Tools: PyTorch BLEU, SacreBLEU CHRF.