tejagowda committed on Commit 1e1092f · verified · 1 Parent(s): f26c9e0

Update README.md

README.md +140 -0

LSTM and Seq-to-Seq Language Translator
This project implements language translation using two approaches:

LSTM-based Translator: A model that translates between English and Hebrew using a basic encoder-decoder architecture.
Seq-to-Seq Translator: A sequence-to-sequence model without attention for bidirectional translation between English and Hebrew.
Both models are trained on a parallel dataset of 1000 sentence pairs and evaluated using BLEU and CHRF scores.

Model Architectures
1. LSTM-Based Translator
The LSTM model is built with the following components:

Encoder: Embedding and LSTM layers to encode English input sequences into latent representations.
Decoder: Embedding and LSTM layers initialized with the encoder's states, generating Hebrew translations token-by-token.
Dense Layer: A fully connected output layer with a softmax activation to predict the next word in the sequence.
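
As a rough illustration of what the dense output layer does, the sketch below applies a softmax to a vector of scores and picks the most likely next word. The toy vocabulary, the scores, and the helper names `softmax` and `predict_next_word` are assumptions made for this example; the project itself relies on a Keras Dense layer rather than hand-written code.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def predict_next_word(logits, index_to_word):
    """Return the most probable word and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax over the vocabulary
    return index_to_word[best], probs[best]

# Hypothetical four-word target vocabulary and made-up decoder scores.
vocab = ["<start>", "<end>", "שלום", "עולם"]
word, prob = predict_next_word([0.1, 0.3, 2.5, 0.7], vocab)
print(word, round(prob, 3))
```

The softmax output has one probability per vocabulary entry, so the size of this layer grows with the target vocabulary.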
2. Seq-to-Seq Translator
The Seq-to-Seq model uses:

Encoder: Similar to the LSTM-based translator, this encodes the input sequence into context vectors.
Decoder: Predicts the target sequence without attention, relying entirely on the encoded context.
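
Decoding without attention can be sketched as a loop that starts from the encoder's final state and feeds each predicted token back in until an end token appears. This is a minimal greedy-decoding sketch, not the project's inference code: `greedy_decode`, `step_fn`, and the toy decoder below are hypothetical stand-ins for a trained model.

```python
def greedy_decode(initial_state, step_fn, start_id, end_id, max_len=20):
    """Generate token ids one at a time until end_id or max_len is reached.

    initial_state: the encoder's final state, the decoder's only view of the input.
    step_fn: hypothetical callable (state, prev_id) -> (scores, new_state).
    """
    state = initial_state
    prev_id = start_id
    output = []
    for _ in range(max_len):
        scores, state = step_fn(state, prev_id)
        prev_id = max(range(len(scores)), key=scores.__getitem__)  # pick the top-scoring token
        if prev_id == end_id:
            break
        output.append(prev_id)
    return output

START, END = 0, 1

def toy_step(state, prev_id):
    """Toy decoder: emits the ids stored in the state in order, then END."""
    context, position = state
    scores = [0.0] * 6
    target = context[position] if position < len(context) else END
    scores[target] = 1.0
    return scores, (context, position + 1)

print(greedy_decode(((3, 4, 5), 0), toy_step, START, END))
```

Because the state passed between steps is the only carrier of source information, long inputs have to be compressed into that single state, which is the main limitation attention is designed to address.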

Dataset
The models are trained on a custom parallel dataset containing 1000 English-Hebrew sentence pairs, formatted as JSON with fields english and hebrew. The Hebrew text is wrapped in <start> and <end> tokens so the decoder knows where a translation begins and ends.
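
A minimal sketch of reading such a dataset follows. It assumes the JSON is an array of objects with `english` and `hebrew` keys, which the description suggests but does not spell out; the sample records and the helper names `load_pairs` and `add_markers` are made up for illustration.

```python
import json

def load_pairs(json_text):
    """Parse a JSON array of {"english": ..., "hebrew": ...} records into tuples."""
    records = json.loads(json_text)
    return [(r["english"], r["hebrew"]) for r in records]

def add_markers(sentence, start="<start>", end="<end>"):
    """Wrap a target sentence in start/end tokens unless they are already present."""
    words = sentence.split()
    if not words or words[0] != start:
        words.insert(0, start)
    if words[-1] != end:
        words.append(end)
    return " ".join(words)

# Two made-up records in the assumed layout.
sample = '[{"english": "hello", "hebrew": "שלום"}, {"english": "thank you", "hebrew": "תודה"}]'
pairs = [(en, add_markers(he)) for en, he in load_pairs(sample)]
for en, he in pairs:
    print(en, "->", he)
```

Checking for existing markers keeps the helper safe to run on data that has already been wrapped.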

Preprocessing:

Tokenization: Text is tokenized using Keras' Tokenizer.
Padding: Sequences are padded to a fixed length for training.
Vocabulary Sizes:
English: [English Vocabulary Size]
Hebrew: [Hebrew Vocabulary Size]
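
The preprocessing steps above can be sketched in plain Python. This is a simplified stand-in for what a tokenizer and a padding utility do, not the Keras API: `build_vocab`, `encode`, and `pad` are hypothetical helpers, and the two sentences are made-up examples.

```python
from collections import Counter

def build_vocab(sentences):
    """Map each word to an integer id; id 0 is reserved for padding."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    # Most frequent words get the smallest ids; ties keep first-seen order.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def encode(sentence, vocab):
    """Turn a sentence into a list of ids, skipping unknown words."""
    return [vocab[w] for w in sentence.lower().split() if w in vocab]

def pad(sequences, length, value=0):
    """Right-pad (or truncate) every sequence to the same length."""
    return [(seq + [value] * length)[:length] for seq in sequences]

sentences = ["the cat sat", "the dog sat down"]
vocab = build_vocab(sentences)
padded = pad([encode(s, vocab) for s in sentences], length=5)
print(len(vocab) + 1)  # vocabulary size, counting the padding id
print(padded)
```

The vocabulary size printed here is the number that sizes the embedding and output layers, which is why it is worth reporting separately from the number of sentence pairs.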

Training Details
Training Parameters:
Optimizer: Adam
Loss Function: Sparse Categorical Crossentropy
Batch Size: 32
Epochs: 20
Validation Split: 20%
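
To make the loss function concrete, here is a small sketch of sparse categorical crossentropy: targets are integer token ids rather than one-hot vectors, and the loss is the average negative log-probability the model assigns to the correct id. The function name matches the loss but the implementation and the numbers are illustrative assumptions, not the Keras internals.

```python
import math

def sparse_categorical_crossentropy(target_ids, predicted_probs):
    """Average negative log-probability assigned to the correct token ids.

    target_ids: integer class ids, one per time step.
    predicted_probs: one probability distribution (list of floats) per time step.
    """
    eps = 1e-7  # guards against log(0)
    losses = [
        -math.log(max(probs[target], eps))
        for target, probs in zip(target_ids, predicted_probs)
    ]
    return sum(losses) / len(losses)

targets = [2, 0]                                   # made-up correct ids
probs = [[0.1, 0.2, 0.7], [0.5, 0.25, 0.25]]       # made-up model outputs
loss = sparse_categorical_crossentropy(targets, probs)
print(round(loss, 4))
```

Using integer targets avoids materializing a one-hot vector the size of the vocabulary for every token in every sequence.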
Checkpoints:
The best-performing version of each model, as measured by validation loss, is saved using Keras' ModelCheckpoint.
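
The save-on-improvement policy can be sketched without any framework. In Keras this behaviour corresponds to ModelCheckpoint with `monitor="val_loss"` and `save_best_only=True`; whether the project uses exactly those arguments is an assumption. The helper `checkpoint_epochs` and the loss values are hypothetical.

```python
def checkpoint_epochs(val_losses):
    """Return the (epoch, loss) pairs at which a save-best-only policy would save."""
    saved = []
    best = float("inf")
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # save only on strict improvement
            best = loss
            saved.append((epoch, loss))
    return saved

history = [2.31, 1.87, 1.92, 1.60, 1.65]  # made-up validation losses per epoch
saves = checkpoint_epochs(history)
print(saves)
```

The last saved entry is the model that would be loaded for evaluation, even if later epochs reduced the training loss further.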

Training Metrics:
Both models track:

Training Loss
Validation Loss

Evaluation Metrics
1. BLEU Score:
The BLEU metric evaluates translation quality by measuring n-gram overlap between a model's output and reference translations. Higher BLEU scores indicate better translations.

LSTM Model BLEU: [BLEU Score for LSTM]
Seq-to-Seq Model BLEU: [BLEU Score for Seq-to-Seq]
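
For intuition about what BLEU measures, the following is a simplified, unsmoothed sentence-level sketch against a single reference. It is not the evaluation tool used by the project, and its numbers will generally differ from corpus-level scores reported by standard libraries; `ngrams` and `sentence_bleu` are names chosen for this example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

print(sentence_bleu("the cat sat on the mat today", "the cat sat on the mat"))
```

Without smoothing, any sentence shorter than four tokens scores zero, which matters for a dataset of short sentence pairs and is one reason library implementations offer smoothing options.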
2. CHRF Score:
The CHRF metric evaluates translations using character-level n-gram F-scores. Higher CHRF scores indicate better translations.

LSTM Model CHRF: [CHRF Score for LSTM]
Seq-to-Seq Model CHRF: [CHRF Score for Seq-to-Seq]
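
The idea behind a character-level F-score can be sketched as follows. This simplified version averages character n-gram precision and recall over orders 1 to 6 and combines them with beta = 2, so recall counts more than precision. The defaults and the helper names are assumptions for illustration, and the result may differ from the scores a standard library reports.

```python
from collections import Counter

def char_ngrams(text, n):
    """Count character n-grams, ignoring whitespace."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))

def chrf(candidate, reference, max_n=6, beta=2.0):
    """Simplified chrF in [0, 1]: F-beta over averaged n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        cand, ref = char_ngrams(candidate, n), char_ngrams(reference, n)
        if not cand or not ref:
            continue  # skip orders longer than either string
        overlap = sum((cand & ref).values())  # clipped count of matching n-grams
        precisions.append(overlap / sum(cand.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

print(round(chrf("translation", "translations"), 3))
```

Working on characters rather than words gives partial credit for near-miss word forms, which is useful for a morphologically rich language such as Hebrew.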

Results
Training Loss Comparison: The Seq-to-Seq model converged slightly better than the LSTM model, which is attributed to its structured architecture.
Translation Quality: The BLEU and CHRF scores indicate that both models produce reasonable translations, with the Seq-to-Seq model performing better on longer sentences.

Acknowledgments
Dataset: [Custom Parallel Dataset]
Evaluation Tools: PyTorch BLEU, SacreBLEU CHRF.