---
library_name: transformers
pipeline_tag: text-generation
tags:
- BGPT
- meta
- pytorch
- llama
- llama-3
---
# Model Information
BGPT is a fine-tuned version of Llama-3.2-3B-Instruct, optimized for generating high-quality multilingual output across 11 Indic languages. The model demonstrates strong capabilities in translation, summarization, and conversational tasks while retaining the base model's performance characteristics.
## Model Developer
Harsh Bande
## Model Architecture
- **Base Model:** Llama-3.2-3B-Instruct
- **Model Type:** Instruction-tuned multilingual text-generation model
- **Architecture:** Auto-regressive transformer language model
- **Adaptation Method:** LoRA (Low-Rank Adaptation)
## Supported Languages
Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
# Intended Use
## Primary Use Cases
- Multilingual text generation
- Cross-lingual translation
- Text summarization
- Conversational AI in Indic languages
- Language understanding and generation tasks
## How to Get Started with the Model
Make sure to update your transformers installation via `pip install --upgrade transformers`.
Use the code below to get started with the model.
```python
import torch
from transformers import pipeline

model_id = "Onkarn/ML-Test-v01"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a helpful assistant who responds in Hindi."},
    # "What is the capital of Karnataka?"
    {"role": "user", "content": "कर्नाटक की राजधानी क्या है?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
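The same pipeline can serve the cross-lingual translation use case listed above by changing the system prompt. A minimal sketch follows; the `build_translation_messages` helper is ours for illustration, not part of the `transformers` API:

```python
def build_translation_messages(text, target_language):
    """Build a chat prompt asking the model to translate `text`
    into `target_language` (one of the 11 supported Indic languages)."""
    return [
        {
            "role": "system",
            "content": f"You are a translation assistant. "
                       f"Translate the user's text into {target_language}.",
        },
        {"role": "user", "content": text},
    ]

messages = build_translation_messages("What is the capital of Karnataka?", "Hindi")

# Generation itself requires loading the model as shown above:
# outputs = pipe(messages, max_new_tokens=256)
# print(outputs[0]["generated_text"][-1])
```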
## Training Details
### Training Data
- **Dataset Composition:** Curated collection of text from 11 Indic languages
- **Languages Covered:** Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
### Training Parameters
- **Optimization Technique**: LoRA (Low-Rank Adaptation)
- **Epochs**: 3
- **Batch Size**: 2 (per-device train batch size)
- **Learning Rate**: 5e-05
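The reported hyperparameters can be sketched as a `peft` + `transformers` fine-tuning configuration. Only the adaptation method (LoRA), epochs, batch size, and learning rate come from the card; the LoRA rank, alpha, and target modules below are illustrative assumptions:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# LoRA configuration; r, lora_alpha, and target_modules are assumed values
# for illustration -- the card reports only that LoRA was used.
lora_config = LoraConfig(
    r=16,                                  # assumed rank
    lora_alpha=32,                         # assumed scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)

# Hyperparameters as reported in the card.
training_args = TrainingArguments(
    output_dir="bgpt-lora",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=5e-5,
)

# Wrap the base model with LoRA adapters before passing it to a Trainer:
# base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
# model = get_peft_model(base, lora_config)
```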
## Hardware and Environmental Impact
### Training Infrastructure
- **Hardware:** T4 GPU
- **Cloud Provider:** Google Cloud Platform
- **Region:** asia-southeast1
- **Training Duration:** 29 hours
### Environmental Impact Assessment
- **Carbon Emissions:** 0.85 kgCO₂eq
- **Carbon Offset:** 100% offset by the cloud provider
- **Location:** asia-southeast1 region
## Limitations and Biases
- The model's performance may vary across different Indic languages
- The model inherits both capabilities and limitations of the base Llama3.2-3B-Instruct model
- Users should conduct appropriate testing for their specific use cases
## License
[More Information Needed]
## Citation and References
[More Information Needed]