# English to Hindi Translation (Quantized Model)

This repository contains a **quantized English-to-Hindi translation model** fine-tuned on the [`Aarif1430/english-to-hindi`](https://huggingface.co/datasets/Aarif1430/english-to-hindi) dataset and optimized using **dynamic quantization** for efficient CPU inference.

## πŸ”§ Model Details

- **Base model**: [`Helsinki-NLP/opus-mt-en-hi`](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi)
- **Dataset**: [`Aarif1430/english-to-hindi`](https://huggingface.co/datasets/Aarif1430/english-to-hindi)
- **Training platform**: Kaggle (CUDA GPU)
- **Fine-tuned**: On English-Hindi pairs from the Hugging Face dataset
- **Quantization**: PyTorch Dynamic Quantization (`torch.quantization.quantize_dynamic`)
- **Tokenizer**: Saved alongside the model
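
Dynamic quantization converts a model's `nn.Linear` weights to int8 while keeping activations in float32, which shrinks the model and speeds up CPU inference. A minimal sketch of the same `torch.quantization.quantize_dynamic` call, demonstrated here on a small stand-in module rather than the actual translation model (the layer sizes are illustrative only):

```python
import torch
import torch.nn as nn

# Stand-in for the seq2seq model; dynamic quantization targets its nn.Linear layers.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
model.eval()

# Replace every nn.Linear with an int8-weight dynamic-quantized version.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The quantized copy runs on CPU with a smaller footprint and near-identical outputs.
x = torch.randn(1, 16)
out = quantized(x)
print(out.shape)
```

The second argument (`{nn.Linear}`) selects which module types to quantize; for translation models the linear projections dominate the parameter count, so this captures most of the size savings.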

## πŸ“ Folder Structure

```
quantized_model/
β”œβ”€β”€ config.json
β”œβ”€β”€ pytorch_model.bin
β”œβ”€β”€ tokenizer_config.json
β”œβ”€β”€ tokenizer.json
└── vocab.json / merges.txt
```


---

## πŸš€ Usage

### πŸ”Ή 1. Load Quantized Model for Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("./quantized_model")

# Load quantized model
model = AutoModelForSeq2SeqLM.from_pretrained("./quantized_model")
model.eval()

# Run translation
translator = pipeline("translation_en_to_hi", model=model, tokenizer=tokenizer, device=-1)

text = "How are you?"
print("Hindi:", translator(text)[0]['translation_text'])
```

## Model Training Summary

- Loaded the dataset: `Aarif1430/english-to-hindi`
- Mapped records into `{"en": ..., "hi": ...}` translation pairs before training
- Trained for 3 epochs on a GPU
- Disabled wandb logging
- Skipped the evaluation phase
- Saved the trained + quantized model and tokenizer
- Quantized with `torch.quantization.quantize_dynamic` for efficient CPU inference
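
The mapping step above can be sketched in plain Python. The column names `english_sentence` and `hindi_sentence` are an assumption about the dataset's schema, used here for illustration; adjust them to the actual field names:

```python
# Hypothetical raw records; the real dataset's column names are an assumption.
raw_records = [
    {"english_sentence": "How are you?", "hindi_sentence": "आप कैसे हैं?"},
    {"english_sentence": "Good morning.", "hindi_sentence": "सुप्रभात।"},
]

def to_translation_pairs(records):
    """Reshape rows into the {"en": ..., "hi": ...} pairs used for training."""
    return [
        {"en": r["english_sentence"], "hi": r["hindi_sentence"]}
        for r in records
    ]

pairs = to_translation_pairs(raw_records)
print(pairs[0]["en"])  # How are you?
```

In practice this mapping would be applied with `datasets.Dataset.map` before tokenization, so each training example carries a matched English/Hindi pair.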