tejagowda committed on
Commit 3413b8d · verified · 1 Parent(s): 56bebd8

Update README.md

Files changed (1)
  1. README.md +43 -1
README.md CHANGED
@@ -2,4 +2,46 @@
language:
- en
- he
---
# Bilingual Language Model for Next Token Prediction

## Overview
This project builds a neural language model for next-token prediction in two languages: **English** and **Hebrew**. The model uses an LSTM (Long Short-Term Memory) architecture, a type of recurrent neural network (RNN), to predict the next word in a sequence, and is evaluated with the **perplexity** metric to measure the quality of its predictions.

The final model and checkpoints are provided, along with the training history, including perplexity and loss values.

## Model Architecture
- **Embedding Layer**: Converts token ids into dense vector representations.
- **LSTM Layer**: 128 units, capturing long-term dependencies in the sequence data.
- **Dense Output Layer**: Outputs a softmax probability distribution over the vocabulary to predict the next word.
- **Vocabulary Size**: The model is trained with a combined English and Hebrew vocabulary of size `[total_words]`.

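The layer stack above can be sketched in Keras as follows. Only the 128 LSTM units come from this README; the vocabulary size, embedding dimension, and sequence length below are hypothetical placeholders, not the repository's actual values.

```python
import tensorflow as tf

total_words = 100   # placeholder: combined English + Hebrew vocabulary size
seq_len = 10        # placeholder: padded input length (without the label token)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(total_words, 64),                 # dense token vectors
    tf.keras.layers.LSTM(128),                                  # 128 units, per the README
    tf.keras.layers.Dense(total_words, activation="softmax"),   # next-word distribution
])

# One forward pass: a batch of 2 token-id sequences -> one vocabulary
# distribution per sequence, shape (2, total_words).
probs = model(tf.zeros((2, seq_len), dtype=tf.int32))
```

Each row of `probs` sums to 1, so the most likely next token is simply the argmax over the vocabulary axis.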
## Dataset
The model is trained on a combination of English and Hebrew text datasets. The input sequences are tokenized and padded so that every training example has the same length.

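A minimal, pure-Python sketch (with toy, hypothetical data) of how next-token training pairs are typically built from such a corpus: each line is expanded into n-gram prefixes, left-padded to a common length, and split into an input sequence and a final-token label.

```python
corpus = ["the cat sat", "שלום עולם"]  # toy English + Hebrew lines (placeholders)

# Build a word -> id vocabulary; id 0 is reserved for padding.
vocab = {}
for line in corpus:
    for word in line.split():
        vocab.setdefault(word, len(vocab) + 1)

# Expand each line into n-gram prefix sequences (length >= 2).
sequences = []
for line in corpus:
    ids = [vocab[w] for w in line.split()]
    for i in range(2, len(ids) + 1):
        sequences.append(ids[:i])

# Left-pad with zeros to the longest sequence length.
max_len = max(len(s) for s in sequences)
padded = [[0] * (max_len - len(s)) + s for s in sequences]

# Inputs are all tokens but the last; the label is the final token.
X = [s[:-1] for s in padded]
y = [s[-1] for s in padded]
```

With the toy corpus above this yields three padded sequences of length 3, whose labels are the ids of "cat", "sat", and "עולם".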
## Training
The model was trained with the following parameters:
- **Optimizer**: Adam
- **Loss Function**: Categorical Crossentropy
- **Batch Size**: 64
- **Epochs**: 20
- **Validation Split**: 20%

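The settings above can be sketched as a Keras `compile`/`fit` call. This is not the repository's training script: the tiny model and random data are hypothetical stand-ins, while the optimizer, loss, batch size, epoch count, and validation split match the list above.

```python
import numpy as np
import tensorflow as tf

vocab_size = 50  # placeholder vocabulary size
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 8),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Random stand-in data: 64 sequences of 5 token ids, one-hot next-token labels.
X = np.random.randint(0, vocab_size, size=(64, 5))
y = tf.keras.utils.to_categorical(np.random.randint(0, vocab_size, 64), vocab_size)

history = model.fit(X, y, batch_size=64, epochs=20,
                    validation_split=0.2, verbose=0)
```

`history.history` then holds the per-epoch `loss` and `val_loss` values from which the perplexity curves below are derived.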
## Evaluation Metric: Perplexity
Perplexity measures how well the model predicts held-out text: it is the exponential of the average cross-entropy loss, so lower perplexity indicates better generalization to unseen data. The final perplexity scores are:
- **Final Training Perplexity**: `[Final Training Perplexity]`
- **Final Validation Perplexity**: `[Final Validation Perplexity]`

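Since the model is trained with categorical cross-entropy, perplexity follows directly from the reported loss (the loss value below is a hypothetical example, not this model's score):

```python
import math

# Per-token cross-entropy loss in nats, as reported by Keras during training.
val_loss = 5.2  # hypothetical example value

# Perplexity is the exponential of the average cross-entropy.
perplexity = math.exp(val_loss)
```

A perfect model (loss 0) would have perplexity 1; higher loss grows the perplexity exponentially.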
## Checkpoints
A checkpoint callback saves the model at its best-performing epoch, as measured by validation loss. The best checkpoint (`best_model.keras`) is included and can be loaded directly for inference.

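A sketch of that checkpoint mechanism using the standard Keras callback (the filename matches this repository; the exact training script is an assumption):

```python
import tensorflow as tf

# Save the model only when validation loss improves.
ckpt = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras",    # checkpoint file shipped with this repository
    monitor="val_loss",    # track validation loss
    save_best_only=True,   # keep only the best-performing model
)

# Passed to model.fit(..., callbacks=[ckpt]); afterwards the best model can
# be restored for inference with:
#   model = tf.keras.models.load_model("best_model.keras")
```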
## Results
The model achieves satisfactory perplexity scores for next-token prediction in both English and Hebrew; the final training and validation values are reported above.

## How to Use
To use this model, follow these steps:

1. **Clone the repository**:

   ```bash
   git clone https://huggingface.co/username/model-name
   cd model-name
   ```