# LSTM and Seq-to-Seq Language Translator

This project implements language translation using two approaches:

- **LSTM-based Translator:** A model that translates between English and Hebrew using a basic encoder-decoder architecture.
- **Seq-to-Seq Translator:** A sequence-to-sequence model without attention for bidirectional translation between English and Hebrew.

Both models are trained on a parallel dataset of 1,000 sentence pairs and evaluated using BLEU and CHRF scores.

## Model Architectures

### 1. LSTM-Based Translator

The LSTM model is built from the following components:

- **Encoder:** Embedding and LSTM layers that encode English input sequences into latent representations.
- **Decoder:** Embedding and LSTM layers, initialized with the encoder's final states, that generate the Hebrew translation token by token.
- **Dense Layer:** A fully connected output layer with softmax activation that predicts the next word in the sequence.
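The components above can be sketched with the Keras functional API. The layer sizes here (vocabulary sizes of 1000, embedding dimension 128, 256 LSTM units) are illustrative assumptions, not values taken from the project:

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Illustrative sizes; the real values depend on the fitted tokenizers.
src_vocab, tgt_vocab, embed_dim, units = 1000, 1000, 128, 256

# Encoder: embeds the English sequence and keeps the final LSTM states.
enc_inputs = Input(shape=(None,))
enc_embed = Embedding(src_vocab, embed_dim, mask_zero=True)(enc_inputs)
_, state_h, state_c = LSTM(units, return_state=True)(enc_embed)

# Decoder: starts from the encoder's states and predicts Hebrew tokens.
dec_inputs = Input(shape=(None,))
dec_embed = Embedding(tgt_vocab, embed_dim, mask_zero=True)(dec_inputs)
dec_out = LSTM(units, return_sequences=True)(dec_embed,
                                             initial_state=[state_h, state_c])
outputs = Dense(tgt_vocab, activation="softmax")(dec_out)

model = Model([enc_inputs, dec_inputs], outputs)
```

During training the decoder input is the target sequence shifted by one token (teacher forcing), so the softmax at each step predicts the next word.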
### 2. Seq-to-Seq Translator

The Seq-to-Seq model uses:

- **Encoder:** Like the LSTM-based translator's, this encodes the input sequence into context vectors.
- **Decoder:** Predicts the target sequence without attention, relying entirely on the encoded context.
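At inference time both decoders generate one token at a time until the `<end>` marker appears. A minimal greedy-decoding loop, using a toy stand-in for the trained decoder (`toy_scorer` below is hypothetical, not part of the project), looks like this:

```python
START, END = "<start>", "<end>"

def greedy_decode(next_token_scores, max_len=20):
    """Greedy decoding: feed tokens back in until <end> or max_len.

    `next_token_scores(prefix)` stands in for the decoder; it returns a
    {token: score} dict conditioned on the tokens generated so far.
    """
    tokens = [START]
    while len(tokens) < max_len:
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        if best == END:
            break
        tokens.append(best)
    return tokens[1:]  # drop the <start> marker

# Toy scorer that always emits a fixed sentence, then <end>.
def toy_scorer(prefix, target=("shalom", "olam")):
    i = len(prefix) - 1  # number of tokens emitted so far
    nxt = target[i] if i < len(target) else END
    return {t: (1.0 if t == nxt else 0.0) for t in (*target, END)}

print(greedy_decode(toy_scorer))  # → ['shalom', 'olam']
```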
## Dataset

The models are trained on a custom parallel dataset of 1,000 English-Hebrew sentence pairs, formatted as JSON with the fields `english` and `hebrew`. The Hebrew text is wrapped in `<start>` and `<end>` tokens to delimit decoding.

**Preprocessing:**

- **Tokenization:** Text is tokenized using Keras' `Tokenizer`.
- **Padding:** Sequences are padded to a fixed length for training.

**Vocabulary Sizes:**

- English: [English Vocabulary Size]
- Hebrew: [Hebrew Vocabulary Size]
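The two preprocessing steps can be illustrated with a small pure-Python stand-in for Keras' `Tokenizer` and `pad_sequences` (post-padding is used here for simplicity; Keras pads at the front by default):

```python
def fit_tokenizer(sentences):
    """Build a word→index map, mirroring Keras' Tokenizer.

    Index 0 is reserved for padding, so real words start at 1.
    """
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab

def texts_to_padded(sentences, vocab, maxlen):
    """Convert text to index sequences, then pad/truncate to maxlen."""
    padded = []
    for sentence in sentences:
        seq = [vocab[w] for w in sentence.lower().split() if w in vocab]
        seq = seq[:maxlen] + [0] * (maxlen - len(seq))
        padded.append(seq)
    return padded

en_sentences = ["hello world", "hello there"]
vocab = fit_tokenizer(en_sentences)
print(texts_to_padded(en_sentences, vocab, maxlen=4))
# → [[1, 2, 0, 0], [1, 3, 0, 0]]
```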
## Training Details

**Training Parameters:**

- Optimizer: Adam
- Loss Function: Sparse Categorical Crossentropy
- Batch Size: 32
- Epochs: 20
- Validation Split: 20%

**Checkpoints:** Models are saved at their best-performing epochs, based on validation loss, using Keras' `ModelCheckpoint`.

**Training Metrics:** Both models track:

- Training Loss
- Validation Loss
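This training setup corresponds roughly to the Keras calls below. The tiny model and random integer data are placeholders so the sketch is self-contained; the real runs use the translator models, the actual dataset, and 20 epochs:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import ModelCheckpoint

# Placeholder sequence model standing in for a translator.
model = models.Sequential([
    layers.Embedding(50, 8),
    layers.LSTM(16, return_sequences=True),
    layers.Dense(50, activation="softmax"),
])

# Training configuration from this README: Adam optimizer and
# sparse categorical crossentropy loss.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Keep only the weights with the best validation loss so far.
checkpoint = ModelCheckpoint("best_model.keras", monitor="val_loss",
                             save_best_only=True)

# Random stand-in data; shortened to 2 epochs to keep the sketch fast.
x = np.random.randint(1, 50, size=(64, 6))
y = np.random.randint(1, 50, size=(64, 6))
history = model.fit(x, y, batch_size=32, epochs=2, validation_split=0.2,
                    callbacks=[checkpoint], verbose=0)
```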
## Evaluation Metrics

### 1. BLEU Score

The BLEU metric evaluates translation quality by comparing model output against reference translations; higher scores indicate better translations.

- LSTM Model BLEU: [BLEU Score for LSTM]
- Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]

### 2. CHRF Score

The CHRF metric evaluates translations using character-level n-gram F-scores; higher scores indicate better translations.

- LSTM Model CHRF: [CHRF Score for LSTM]
- Seq-to-Seq Model CHRF: [CHRF Score for Seq-to-Seq]
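In practice these scores come from libraries such as SacreBLEU. As an illustration of what a character-level F-score measures, here is a simplified CHRF-style computation (not the official metric; it only averages character n-gram precision and recall after stripping spaces):

```python
from collections import Counter

def char_ngrams(text, n):
    """All character n-grams of the string, with spaces removed."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def simple_chrf(hypothesis, reference, max_n=3, beta=2.0):
    """Simplified CHRF: mean char n-gram precision/recall, F-beta score."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / max(sum(hyp.values()), 1))
        recalls.append(overlap / max(sum(ref.values()), 1))
    p = sum(precisions) / max_n
    r = sum(recalls) / max_n
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)

print(simple_chrf("hello world", "hello world"))  # → 1.0
print(round(simple_chrf("hello there", "hello world"), 3))
```

A perfect match scores 1.0 and disjoint strings score 0.0; recall is weighted more heavily than precision (beta = 2), as in the real CHRF.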
## Results

- **Training Loss Comparison:** The Seq-to-Seq model converged to a slightly lower validation loss than the LSTM model.
- **Translation Quality:** The BLEU and CHRF scores indicate that both models produce reasonable translations, with the Seq-to-Seq model performing better on longer sentences.

## Acknowledgments

- Dataset: [Custom Parallel Dataset]
- Evaluation Tools: PyTorch BLEU, SacreBLEU CHRF