# English to Hindi Translation (Quantized Model)
This repository contains a **quantized English-to-Hindi translation model** fine-tuned on the [`Aarif1430/english-to-hindi`](https://huggingface.co/datasets/Aarif1430/english-to-hindi) dataset and optimized using **dynamic quantization** for efficient CPU inference.
## 🧠 Model Details
- **Base model**: [`Helsinki-NLP/opus-mt-en-hi`](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi)
- **Dataset**: Aarif1430/english-to-hindi
- **Training platform**: Kaggle (CUDA GPU)
- **Fine-tuned**: On English-Hindi pairs from the Hugging Face dataset
- **Quantization**: PyTorch Dynamic Quantization (`torch.quantization.quantize_dynamic`)
- **Tokenizer**: Saved alongside the model
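Dynamic quantization converts the weights of `nn.Linear` layers to int8 while keeping activations in float, which shrinks the model and speeds up CPU inference. A minimal, self-contained sketch on a toy module (the repository applies the same `quantize_dynamic` call to the fine-tuned `opus-mt-en-hi` model):

```python
import torch
import torch.nn as nn

# Toy stand-in for the seq2seq model, used here only for illustration
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Quantize all nn.Linear layers to int8 weights for efficient CPU inference
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

out = qmodel(torch.randn(2, 8))  # forward pass works exactly as before
print(qmodel)  # Linear layers are now DynamicQuantizedLinear modules
```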
## 📁 Folder Structure
```
quantized_model/
├── config.json
├── pytorch_model.bin
├── tokenizer_config.json
├── tokenizer.json
└── vocab.json / merges.txt
```
---
## 🚀 Usage
### 🔹 1. Load Quantized Model for Inference
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load the tokenizer and quantized model from the local directory
tokenizer = AutoTokenizer.from_pretrained("./quantized_model")
model = AutoModelForSeq2SeqLM.from_pretrained("./quantized_model")
model.eval()

# Run translation on CPU (device=-1)
translator = pipeline("translation_en_to_hi", model=model, tokenizer=tokenizer, device=-1)
text = "How are you?"
print("Hindi:", translator(text)[0]["translation_text"])
```
## Model Training Summary
- Loaded dataset: Aarif1430/english-to-hindi
- Mapped examples to `{"en": ..., "hi": ...}` pairs before training
- Training: 3 epochs using GPU
- Disabled: wandb logging
- Skipped: Evaluation phase
- Saved: Trained + Quantized model and tokenizer
- Quantization: `torch.quantization.quantize_dynamic` applied for efficient CPU inference
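One caveat when saving and reloading quantized weights (sketched on a toy module, not the actual checkpoint): a dynamically quantized state dict can only be loaded into a model that has already been quantized the same way, so the fp32 architecture must be re-quantized before `load_state_dict`:

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Toy stand-in; the real repo quantizes the fine-tuned opus-mt-en-hi model
    return nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))


# Quantize and save the state dict
qmodel = torch.quantization.quantize_dynamic(make_model(), {nn.Linear}, dtype=torch.qint8)
path = os.path.join(tempfile.mkdtemp(), "quantized_toy.pt")
torch.save(qmodel.state_dict(), path)

# Re-apply quantize_dynamic to a fresh fp32 model BEFORE loading quantized weights
reloaded = torch.quantization.quantize_dynamic(make_model(), {nn.Linear}, dtype=torch.qint8)
reloaded.load_state_dict(torch.load(path, weights_only=False))
```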