# English to Hindi Translation (Quantized Model)

This repository contains a **quantized English-to-Hindi translation model** fine-tuned on the [`Aarif1430/english-to-hindi`](https://huggingface.co/datasets/Aarif1430/english-to-hindi) dataset and optimized with **dynamic quantization** for efficient CPU inference.

## 🔧 Model Details

- **Base model**: [`Helsinki-NLP/opus-mt-en-hi`](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi)
- **Dataset**: `Aarif1430/english-to-hindi`
- **Training platform**: Kaggle (CUDA GPU)
- **Fine-tuning**: English–Hindi sentence pairs from the Hugging Face dataset
- **Quantization**: PyTorch dynamic quantization (`torch.quantization.quantize_dynamic`)
- **Tokenizer**: Saved alongside the model

## 📁 Folder Structure

```
quantized_model/
├── config.json
├── pytorch_model.bin
├── tokenizer_config.json
├── tokenizer.json
├── vocab.json / merges.txt
```

---

## 🚀 Usage

### 🔹 1. Load Quantized Model for Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("./quantized_model")

# Load quantized model
model = AutoModelForSeq2SeqLM.from_pretrained("./quantized_model")
model.eval()

# Run translation on CPU (device=-1)
translator = pipeline("translation_en_to_hi", model=model, tokenizer=tokenizer, device=-1)

text = "How are you?"
print("Hindi:", translator(text)[0]["translation_text"])
```

## Model Training Summary

- Loaded the `Aarif1430/english-to-hindi` dataset
- Mapped each record to `{"en": ..., "hi": ...}` translation pairs before training
- Trained for 3 epochs on GPU
- Disabled `wandb` logging
- Skipped the evaluation phase
- Saved the trained and quantized model together with the tokenizer
- Applied `torch.quantization.quantize_dynamic` for efficient CPU inference
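The dynamic quantization step mentioned above can be sketched as follows. This is a minimal illustration on a toy module, not the actual fine-tuned model: the real repository applies the same call to the fine-tuned `Helsinki-NLP/opus-mt-en-hi` model, whose `nn.Linear` layers get converted in the same way.

```python
import torch
import torch.nn as nn

# Toy stand-in for the fine-tuned seq2seq model (the real target is
# Helsinki-NLP/opus-mt-en-hi; its nn.Linear layers are quantized identically).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
model.eval()

# Dynamic quantization: weights of the listed module types are stored as int8;
# activations are quantized on the fly at inference time (CPU only).
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(2, 16)
y = quantized(x)
print(y.shape)  # torch.Size([2, 16])
```

No calibration data is needed for this mode, which is why it is a convenient fit for CPU deployment of encoder-decoder models whose compute is dominated by linear layers.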
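The training summary's mapping step ("mapped each record to `{"en": ..., "hi": ...}`") can be sketched like this. The raw column names `english_sentence` and `hindi_sentence` are an assumption about the dataset's schema; substitute the actual column names when reproducing.

```python
# Hypothetical raw records; the column names "english_sentence" and
# "hindi_sentence" are assumed, not confirmed from the dataset card.
raw = [
    {"english_sentence": "How are you?", "hindi_sentence": "आप कैसे हैं?"},
    {"english_sentence": "Thank you.", "hindi_sentence": "धन्यवाद।"},
]

def to_translation_pair(example):
    # Shape expected by seq2seq translation fine-tuning scripts:
    # a "translation" dict keyed by language code.
    return {"translation": {"en": example["english_sentence"],
                            "hi": example["hindi_sentence"]}}

pairs = [to_translation_pair(r) for r in raw]
print(pairs[0]["translation"]["en"])  # How are you?
```

With the `datasets` library the same function would typically be passed to `dataset.map(...)` instead of a list comprehension.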
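One caveat worth knowing when saving and reloading a dynamically quantized model: a common pattern is to re-create the architecture, re-apply `quantize_dynamic`, and then load the saved state dict, since the quantized modules must exist before their packed weights can be restored. A minimal sketch on a toy module (an in-memory buffer stands in for a file on disk):

```python
import io
import torch
import torch.nn as nn

# Build and dynamically quantize a small stand-in model.
model = nn.Sequential(nn.Linear(8, 8)).eval()
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Save the quantized weights (BytesIO here instead of a real file path).
buf = io.BytesIO()
torch.save(quantized.state_dict(), buf)
buf.seek(0)

# Reload: re-create the architecture, re-apply quantize_dynamic,
# then load the saved (packed int8) state dict into it.
fresh = torch.quantization.quantize_dynamic(
    nn.Sequential(nn.Linear(8, 8)).eval(), {nn.Linear}, dtype=torch.qint8
)
fresh.load_state_dict(torch.load(buf, weights_only=False))
```

Both copies now produce identical outputs for the same input.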