	---
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- BGPT
	- meta
	- pytorch
	- llama
	- llama-3
	---


	# Model Information

	BGPT is a fine-tuned version of Llama3.2-3B-Instruct, optimized for multilingual text generation across 11 Indic languages. It targets translation, summarization, and conversational tasks while retaining the performance characteristics of the base model.

	## Model Developer
	Harsh Bande

	## Model Architecture
	- Base Model: Llama3.2-3B-Instruct
	- Model Type: Instruction-tuned multilingual text generation model (fine-tuned Llama)
	- Architecture Type: Auto-regressive language model with an optimized transformer architecture
	- Adaptation Method: LoRA (Low-Rank Adaptation)
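
To illustrate why LoRA keeps fine-tuning inexpensive, the sketch below counts the trainable parameters that LoRA adds to a single weight matrix. The rank and matrix dimensions are illustrative assumptions, not values taken from this model's training configuration, and `lora_param_counts` is a hypothetical helper name.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    LoRA freezes the d_out x d_in weight W and learns a low-rank update B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    full = d_in * d_out            # parameters updated by full fine-tuning
    lora = rank * (d_in + d_out)   # parameters in A plus parameters in B
    return full, lora


# Hypothetical dimensions and rank, for illustration only.
full, lora = lora_param_counts(d_in=3072, d_out=3072, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Because the LoRA count grows linearly with the rank rather than with the product of the matrix dimensions, the adapter stays small even for large layers.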

	## Supported Languages
	Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu

	# Intended Use

	## Primary Use Cases
	- Multilingual text generation
	- Cross-lingual translation
	- Text summarization
	- Conversational AI in Indic languages
	- Language understanding and generation tasks

	## How to Get Started with the Model

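	Make sure to update your transformers installation via `pip install --upgrade transformers`. The example below also passes `device_map="auto"`, which requires the `accelerate` package (`pip install accelerate`).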

	Use the code below to get started with the model.

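	```python
	import torch
	from transformers import pipeline

	# Hugging Face Hub repository ID used by this example.
	model_id = "Onkarn/ML-Test-v01"
	pipe = pipeline(
	    "text-generation",
	    model=model_id,
	    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
	    device_map="auto",  # place the model on the available devices automatically
	)

	# Chat-format input: a system prompt followed by the user's question.
	messages = [
	    {"role": "system", "content": "You are a helpful assistant who responds in Hindi"},
	    {"role": "user", "content": "कर्नाटक की राजधानी क्या है?"},
	]
	outputs = pipe(
	    messages,
	    max_new_tokens=256,
	)

	# For chat input, "generated_text" holds the whole conversation;
	# the last entry is the assistant's reply.
	print(outputs[0]["generated_text"][-1])
	```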
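
The same call pattern can target any of the supported languages by changing the system prompt. The helper below is a minimal sketch: `build_messages` and `SUPPORTED_LANGUAGES` are hypothetical names introduced here, and the system prompt wording simply mirrors the example above.

```python
# Languages listed under "Supported Languages" in this model card.
SUPPORTED_LANGUAGES = (
    "Hindi", "Punjabi", "Marathi", "Malayalam", "Oriya", "Kannada",
    "Gujarati", "Bengali", "Urdu", "Tamil", "Telugu",
)


def build_messages(language: str, user_text: str) -> list[dict[str, str]]:
    """Build a chat-format message list asking for replies in `language`."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    return [
        {
            "role": "system",
            "content": f"You are a helpful assistant who responds in {language}",
        },
        {"role": "user", "content": user_text},
    ]


messages = build_messages("Hindi", "कर्नाटक की राजधानी क्या है?")
```

The returned list can be passed to `pipe(...)` in place of the hand-written `messages` list.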


	## Training Details

	### Training Data

	- Dataset Composition: Curated collection of text from 11 Indic languages
	- Languages Covered: Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu

	### Training Parameters

	- Optimization Technique: LoRA (Low-Rank Adaptation)
	- Epochs: 3
	- Batch Size: 2 (per-device train batch size)
	- Learning Rate: 5e-05
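
For planning purposes, the approximate number of optimizer steps follows from the epoch count and per-device batch size listed above. The sketch below is illustrative only: the dataset size, device count, and gradient accumulation setting are not given in this card and appear as hypothetical inputs, and `total_optimizer_steps` is a helper name introduced here.

```python
import math


def total_optimizer_steps(
    num_examples: int,
    epochs: int = 3,                 # from the model card
    per_device_batch_size: int = 2,  # from the model card
    num_devices: int = 1,            # assumption: not stated in the card
    grad_accum_steps: int = 1,       # assumption: not stated in the card
) -> int:
    """Estimate the optimizer steps for a training run."""
    effective_batch = per_device_batch_size * num_devices * grad_accum_steps
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


# Hypothetical dataset size, for illustration only.
print(total_optimizer_steps(num_examples=10_000))
```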


	## Hardware and Environmental Impact

	### Training Infrastructure
	- Hardware: NVIDIA T4 GPU
	- Cloud Provider: Google Cloud Platform
	- Region: asia-southeast1
	- Training Duration: 29 hours

	### Environmental Impact Assessment
	- Carbon Emissions: 0.85 kgCO₂eq
	- Carbon Offset: 100% offset by the cloud provider
	- Location: asia-southeast1 region
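
The reported emissions figure can be sanity-checked with a simple estimate: power draw times duration times grid carbon intensity. In the sketch below, 70 W is the T4's rated board power, while the carbon intensity of 0.42 kgCO₂eq/kWh is an assumed value chosen for illustration and is not stated in this card. The calculation also assumes the GPU ran at its full rated power for the entire run and ignores CPU, memory, and datacenter overhead.

```python
def estimate_emissions_kg(power_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


estimate = estimate_emissions_kg(
    power_watts=70,       # rated board power of a T4
    hours=29,             # training duration from the model card
    kg_co2_per_kwh=0.42,  # assumption: illustrative grid carbon intensity
)
print(f"{estimate:.2f} kgCO2eq")
```

Under these assumptions the estimate lands close to the 0.85 kgCO₂eq reported above.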

	## Limitations and Biases
	- The model's performance may vary across different Indic languages
	- The model inherits both capabilities and limitations of the base Llama3.2-3B-Instruct model
	- Users should conduct appropriate testing for their specific use cases
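
One way to check how performance varies across languages is to score a small labelled set per language. The following is a minimal sketch under stated assumptions: `accuracy_by_language` is a hypothetical helper, `generate` stands in for whatever function wraps the model, and exact string match is a deliberately crude metric that suits only short factual answers.

```python
from collections import defaultdict
from typing import Callable, Iterable


def accuracy_by_language(
    examples: Iterable[tuple[str, str, str]],
    generate: Callable[[str], str],
) -> dict[str, float]:
    """Compute exact-match accuracy per language.

    `examples` yields (language, prompt, expected_answer) tuples.
    """
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for language, prompt, expected in examples:
        total[language] += 1
        if generate(prompt).strip() == expected.strip():
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}


# Stand-in generator with made-up prompts and answers, used only to show the call shape.
canned_answers = {"q1": "a1", "q2": "wrong", "q3": "a3"}
scores = accuracy_by_language(
    [("Hindi", "q1", "a1"), ("Hindi", "q2", "a2"), ("Tamil", "q3", "a3")],
    generate=lambda prompt: canned_answers[prompt],
)
print(scores)
```

In practice, `generate` would call the pipeline and return the assistant's reply text, and the examples would come from a held-out set for each target language.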

	## License
	[More Information Needed]

	## Citation and References
	[More Information Needed]