---
library_name: transformers
pipeline_tag: text-generation
tags:
- BGPT
- meta
- pytorch
- llama
- llama-3
---

# Model Information

BGPT is a finetuned version of Llama3.2-3B-Instruct, optimized for generating high-quality multilingual output across 11 Indic languages. The model performs well on translation, summarization, and conversational tasks while retaining the base model's performance characteristics.

## Model Developer

Harsh Bande

## Model Architecture

- **Base Model:** Llama3.2-3B-Instruct
- **Architecture Type:** Auto-regressive transformer language model
- **Adaptation Method:** LoRA (Low-Rank Adaptation)
- **Model Type:** Instruction-tuned multilingual text-generation model

## Supported Languages

Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu

# Intended Use

## Primary Use Cases

- Multilingual text generation
- Cross-lingual translation
- Text summarization
- Conversational AI in Indic languages
- Language understanding and generation tasks

## How to Get Started with the Model

Make sure to update your transformers installation via `pip install --upgrade transformers`. Use the code below to get started with the model.
```python
import torch
from transformers import pipeline

model_id = "Onkarn/ML-Test-v01"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a helpful assistant who responds in Hindi."},
    {"role": "user", "content": "कर्नाटक की राजधानी क्या है?"},  # "What is the capital of Karnataka?"
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```

## Training Details

### Training Data

- **Dataset Composition:** Curated collection of text from 11 Indic languages
- **Languages Covered:** Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu

### Training Parameters

- **Optimization Technique:** LoRA (Low-Rank Adaptation)
- **Epochs:** 3
- **Batch Size:** 2 (per-device train batch size)
- **Learning Rate:** 5e-05

## Hardware and Environmental Impact

### Training Infrastructure

- **Hardware:** NVIDIA T4 GPU
- **Cloud Provider:** Google Cloud Platform
- **Region:** asia-southeast1
- **Training Duration:** 29 hours

### Environmental Impact Assessment

- **Carbon Emissions:** 0.85 kgCO₂eq
- **Carbon Offset:** 100% offset by the cloud provider
- **Location:** asia-southeast1 region

## Limitations and Biases

- Performance may vary across the supported Indic languages
- The model inherits both the capabilities and the limitations of the base Llama3.2-3B-Instruct model
- Users should conduct appropriate testing for their specific use cases

## License

[More Information Needed]

## Citation and References

[More Information Needed]
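
## Emissions Estimate (Illustrative)

The reported carbon figure is consistent with a rough back-of-envelope check. The sketch below assumes an average draw of ~70 W for the T4 (its rated TDP) and a grid carbon intensity of ~0.418 kgCO₂eq/kWh for the region; both numbers are illustrative assumptions, not measured values from this training run.

```python
# Rough single-GPU emissions estimate for the training run described above.
# Assumptions (not measured): T4 draws ~70 W on average, and the regional
# grid carbon intensity is ~0.418 kgCO2eq per kWh.
gpu_power_kw = 0.070      # assumed average power draw of one T4, in kW
training_hours = 29       # reported training duration
grid_intensity = 0.418    # assumed kgCO2eq emitted per kWh consumed

energy_kwh = gpu_power_kw * training_hours   # ~2.03 kWh
emissions_kg = energy_kwh * grid_intensity   # ~0.85 kgCO2eq

print(f"Estimated emissions: {emissions_kg:.2f} kgCO2eq")
```

A more careful estimate would also account for CPU/host power and datacenter PUE, so treat this purely as an order-of-magnitude sanity check.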