---
library_name: transformers
pipeline_tag: text-generation
tags:
- BGPT
- meta
- pytorch
- llama
- llama-3
---
## Model Information

BGPT is a fine-tuned version of Llama-3.2-3B-Instruct, optimized for generating high-quality multilingual output across 11 Indic languages. The model demonstrates strong capabilities in translation, summarization, and conversational tasks while retaining the base model's performance characteristics.
## Model Developer

Harsh Bande
## Model Architecture

- Base Model: Llama-3.2-3B-Instruct
- Model Type: Instruction-tuned, auto-regressive multilingual text-generation model built on an optimized transformer architecture
- Adaptation Method: LoRA (Low-Rank Adaptation)
## Supported Languages

Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
## Intended Use

### Primary Use Cases

- Multilingual text generation
- Cross-lingual translation
- Text summarization
- Conversational AI in Indic languages
- Language understanding and generation tasks
## How to Get Started with the Model

Make sure your installation is up to date via `pip install --upgrade transformers`.

Use the code below to get started with the model.

```python
import torch
from transformers import pipeline

model_id = "Onkarn/ML-Test-v01"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant who responds in Hindi"},
    # "What is the capital of Karnataka?" (asked in Hindi)
    {"role": "user", "content": "कर्नाटक की राजधानी क्या है?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
# The last entry of generated_text is the assistant's reply
print(outputs[0]["generated_text"][-1])
```
## Training Details

### Training Data

- Dataset Composition: Curated collection of text from 11 Indic languages
- Languages Covered: Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
### Training Parameters

- Optimization Technique: LoRA (Low-Rank Adaptation)
- Epochs: 3
- Batch Size: 2 (per-device train batch size)
- Learning Rate: 5e-5
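
The LoRA adaptation listed above can be illustrated with a minimal NumPy sketch of the low-rank update it learns. The layer dimensions and the rank/alpha values below are illustrative placeholders, not the actual training configuration (which this card does not specify):

```python
import numpy as np

# LoRA freezes the base weight W and learns a low-rank update:
#   W_eff = W + (alpha / r) * (B @ A)
# where A is (r x d_in), B is (d_out x r), and r << min(d_in, d_out).
d_in, d_out, r, alpha = 64, 64, 8, 16  # illustrative sizes only

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

delta = (alpha / r) * (B @ A)
W_eff = W + delta

# With B initialized to zero, the adapted layer starts identical to the base model.
assert np.allclose(W_eff, W)

# Only A and B are trained: r * (d_in + d_out) parameters instead of d_in * d_out.
trainable = A.size + B.size
full = W.size
print(trainable, full)  # 1024 trainable vs 4096 full
```

This is why LoRA fine-tuning fits on a single small GPU: the trainable parameter count grows linearly with the layer width rather than quadratically.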
## Hardware and Environmental Impact

### Training Infrastructure

- Hardware: NVIDIA T4 GPU
- Cloud Provider: Google Cloud Platform
- Region: asia-southeast1
- Training Duration: 29 hours
### Environmental Impact Assessment

- Carbon Emissions: 0.85 kgCO₂eq
- Carbon Offset: 100% offset by the cloud provider
- Location: asia-southeast1 region
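
As a rough sanity check, the reported figure is consistent with a simple power × time × grid-intensity estimate. The T4 board power and the regional grid carbon intensity below are assumed values for illustration, not numbers from this card:

```python
# Back-of-the-envelope emissions estimate, assuming a ~70 W T4 board power
# and a grid carbon intensity of ~0.42 kgCO2eq/kWh for asia-southeast1
# (both values are assumptions).
gpu_power_kw = 0.070
hours = 29
grid_intensity = 0.42  # kgCO2eq per kWh (assumed)

energy_kwh = gpu_power_kw * hours     # ~2.03 kWh
emissions = energy_kwh * grid_intensity
print(round(emissions, 2))  # ~0.85 kgCO2eq, in line with the reported value
```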
## Limitations and Biases

- The model's performance may vary across different Indic languages
- The model inherits both capabilities and limitations of the base Llama3.2-3B-Instruct model
- Users should conduct appropriate testing for their specific use cases
## License

[More Information Needed]
## Citation and References

[More Information Needed]