---
base_model: unsloth/gemma-2-9b-bnb-4bit
library_name: peft
license: apache-2.0
language:
- en
- mr
---
# Model Card for Gemma 2 9B - English to Marathi Translation
## Model Details
### Model Description
This model is a fine-tuned variant of **Gemma 2 9B (Unsloth's 4-bit build)**, trained for high-quality English-to-Marathi translation. Built on Gemma 2's transformer architecture, the model handles complex translations, idiomatic expressions, and long-context paragraphs effectively, and is optimized for efficient inference through 4-bit quantization.
- **Developed by:** Devavrat Samak
- **Model type:** Causal Language Model, fine-tuned for translation tasks.
- **Language(s) (NLP):** English (en), Marathi (mr)
- **License:** Apache-2.0
- **Finetuned from model:** unsloth/gemma-2-9b-bnb-4bit
### Model Sources
- **Repository:** [Devsam2898/Gemma2-Marathi](https://github.com/Devsam2898/Gemma2-Marathi)
## Uses
### Direct Use
The model can be directly used for English-to-Marathi translations, including handling long-context paragraphs, noisy inputs, and code-mixed sentences.
### Downstream Use
The model can be integrated into applications for:
- Chatbots with multilingual support.
- Translating historical texts for research.
- Localization of content for Marathi-speaking audiences.
### Out-of-Scope Use
- The model is not designed for real-time, high-speed translation in latency-critical systems.
- It may not generalize well for highly domain-specific jargon without additional fine-tuning.
## Bias, Risks, and Limitations
- The model's translations might occasionally lose nuance or context in culturally significant expressions.
- Performance may degrade for noisy data or highly informal text.
### Recommendations
- Users should validate translations in sensitive domains to ensure accuracy.
- Consider additional fine-tuning for domain-specific tasks.
## How to Get Started with the Model
```python
from unsloth import FastLanguageModel
from peft import PeftModel

# Load the 4-bit base model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b-bnb-4bit",
    load_in_4bit=True,
)

# Attach this repository's fine-tuned LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "Devavrat28/Gemmarathi2")
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference mode

# Input and inference
input_text = "The golden age of the Peshwas brought cultural and political prosperity to Maharashtra."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translated_text)
```
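Instruction-tuned translation models are usually sensitive to the prompt format they saw during fine-tuning. The exact template used for this model is not documented here, so the helper below is only a sketch with an assumed template (the function name and wording are illustrative, not the model's actual format); it shows how to keep prompts consistent before calling `model.generate`:

```python
def build_translation_prompt(english_text: str) -> str:
    """Wrap an English sentence in a translation instruction prompt.

    NOTE: this template is an assumption for illustration. Replace it
    with the prompt format actually used during fine-tuning.
    """
    return (
        "Translate the following English text to Marathi.\n"
        f"English: {english_text}\n"
        "Marathi:"
    )

prompt = build_translation_prompt("The weather is pleasant today.")
print(prompt)
```

Passing the resulting `prompt` (instead of the raw sentence) to the tokenizer keeps every request in the same shape, which generally improves output consistency for instruction-tuned models.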