---
language:
- en
- he
---

# Bilingual Language Model for Next Token Prediction

## Overview
This project builds a neural network-based language model for next token prediction in two languages: **English** and **Hebrew**. The model uses an LSTM (Long Short-Term Memory) architecture, a type of recurrent neural network (RNN), to predict the next word in a sequence based on the training data provided. Model quality is evaluated with the **perplexity** metric.

The final model and checkpoints are provided, along with the training history, including perplexity and loss values.

## Model Architecture
- **Embedding Layer**: Converts tokenized words into dense vector representations.
- **LSTM Layer**: 128 units that capture long-term dependencies in the sequence data.
- **Dense Output Layer**: Outputs a probability distribution over the vocabulary to predict the next word.
- **Total Vocabulary Size**: `[total_words]` tokens, combining both the English and Hebrew datasets.
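
The three layers above can be sketched in Keras. Here `total_words` and the 100-dimensional embedding are illustrative placeholder values, not the project's actual settings (a minimal sketch, not the exact training script):

```python
import tensorflow as tf

# Placeholder values; the real ones come from the tokenized corpus.
total_words = 5000  # combined English + Hebrew vocabulary size

model = tf.keras.Sequential([
    # Dense vector representation for each token in the vocabulary.
    tf.keras.layers.Embedding(total_words, 100),
    # 128 LSTM units capture long-term dependencies in the sequence.
    tf.keras.layers.LSTM(128),
    # Probability distribution over the vocabulary for the next token.
    tf.keras.layers.Dense(total_words, activation="softmax"),
])
```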

## Dataset
The model is trained using a combination of English and Hebrew text datasets. The input sequences are tokenized and padded to ensure a consistent input length for training the model.
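
The preprocessing idea (in Keras typically done with `Tokenizer` and `pad_sequences`) can be illustrated in plain Python: each tokenized sentence is expanded into n-gram prefixes, which are then pre-padded with zeros to a common length. This is a simplified sketch of the concept, not the project's actual pipeline:

```python
def make_padded_sequences(token_ids, max_len):
    """Build n-gram prefix sequences from one tokenized sentence
    and left-pad each with zeros to max_len."""
    sequences = []
    for i in range(2, len(token_ids) + 1):
        prefix = token_ids[:i]                           # n-gram prefix
        padded = [0] * (max_len - len(prefix)) + prefix  # pre-padding
        sequences.append(padded)
    return sequences

# Example: token ids for a 4-word sentence, padded to length 5.
# Each row is a padded context; its last element is the prediction target.
seqs = make_padded_sequences([7, 3, 9, 2], max_len=5)
```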

## Training
The model was trained with the following parameters:
- **Optimizer**: Adam
- **Loss Function**: Categorical Crossentropy
- **Batch Size**: 64
- **Epochs**: 20
- **Validation Split**: 20%
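
In Keras, these settings correspond to a compile/fit call along the following lines. The tiny random `X`/`y` stand in for the real padded sequences and one-hot next-word targets just to keep the snippet self-contained, and only one epoch is run here (a sketch, not the exact training script):

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in data; the real X/y come from the tokenized, padded corpus.
total_words = 50
X = np.random.randint(1, total_words, size=(32, 5))
y = tf.keras.utils.to_categorical(
    np.random.randint(0, total_words, size=(32,)), num_classes=total_words)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(total_words, 16),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(total_words, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Batch size 64, 20% held out for validation; epochs reduced from 20
# to 1 so the sketch runs quickly.
history = model.fit(X, y, batch_size=64, epochs=1, validation_split=0.2)
```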

## Evaluation Metric: Perplexity
Perplexity is used to measure the model's performance, with lower perplexity indicating better generalization to unseen data. The final perplexity scores are:
- **Final Training Perplexity**: `[Final Training Perplexity]`
- **Final Validation Perplexity**: `[Final Validation Perplexity]`
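
With categorical cross-entropy as the loss, perplexity is simply the exponential of the mean per-token loss, so it can be derived directly from the training history (a small helper, assuming natural-log cross-entropy as Keras reports it):

```python
import math

def perplexity(cross_entropy_loss):
    """Perplexity is e raised to the mean per-token cross-entropy."""
    return math.exp(cross_entropy_loss)

# A mean loss of ln(2) corresponds to a perplexity of 2: the model is,
# on average, as uncertain as a fair two-way choice.
ppl = perplexity(math.log(2))
```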

## Checkpoints
A checkpoint mechanism saves the model at its best-performing stage based on validation loss. The best model checkpoint (`best_model.keras`) is included and can be loaded for inference.
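
In Keras this is typically done with the `ModelCheckpoint` callback, monitoring validation loss and keeping only the best weights (a sketch of the usual pattern, not necessarily the project's exact configuration):

```python
import tensorflow as tf

# Save the model only when validation loss improves.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras",
    monitor="val_loss",
    save_best_only=True,
)

# Passed to fit via callbacks=[checkpoint]; afterwards the best model
# can be restored with tf.keras.models.load_model("best_model.keras").
```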

## Results
The model demonstrates competitive performance in predicting next tokens for both English and Hebrew, achieving satisfactory perplexity scores on both the training and validation datasets.

## How to Use
To use this model, follow these steps:

1. **Clone the repository**:
```bash
git clone https://huggingface.co/username/model-name
cd model-name