Haaaaarsh/gemma_ac_json

Update README.md

by Rasendra - opened Nov 20, 2024

base: refs/heads/main ← from: refs/pr/2

README.md +45 -13

@@ -10,14 +10,31 @@ tags:
 ---
-### Model Description
-This model is a finetuned version of Llama3.2-3B-Instruct specifically designed for generating multilingual outputs across multiple Indic languages. The model has been trained on a diverse and curated dataset comprising Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu. It is optimized to handle natural language tasks such as translation, summarization, and conversational generation across these languages effectively.
-- **Developed by:** [More Information Needed]
 - **Model type:** Finetuned LLaMA (Language Model for Multilingual Text Generation)
-- **Language(s) (NLP):** Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, Telugu
-- **Finetuned from model:** Llama3.2-3B-Instruct
 ## How to Get Started with the Model
@@ -52,9 +69,8 @@ print(outputs[0]["generated_text"][-1])
 ### Training Data
-The training dataset included a diverse collection of text sources in:
-- Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu.
 ### Training Parameters
@@ -64,11 +80,27 @@ The training dataset included a diverse collection of text sources in:
 - **Learning Rate**: 5e-05
-## Environmental Impact
-- **Hardware Type:** T4
-- **Hours used:** 29 hours
 - **Cloud Provider:** Google Cloud Platform
-- **Compute Region:** asia-southeast1
-- **Carbon Emitted:** Total emissions are estimated to be 0.85 kgCO$_2$eq of which 100 percents were directly offset by the cloud provider.

 ---
+# Model Information
+BGPT is a finetuned version of Llama3.2-3B-Instruct, specifically optimized for generating high-quality multilingual outputs across 11 Indic languages. The model demonstrates strong capabilities in translation, summarization, and conversational tasks while maintaining the base model's performance characteristics.
+## Model Developer
+Harsh Bande
+## Model Architecture
+- **Base Model:** Llama3.2-3B-Instruct
 - **Model type:** Finetuned LLaMA (Language Model for Multilingual Text Generation)
+- **Architecture Type:** Auto-regressive language model with optimized transformer architecture
+- **Adaptation Method:** LoRA (Low-Rank Adaptation)
+- **Model Type:** Instruction-tuned multilingual text generation model
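Review note on the LoRA entry: the card does not state the adapter rank or which layers were adapted. As a rough illustration of why LoRA keeps fine-tuning cheap enough for a single T4, the sketch below compares trainable parameter counts for one weight matrix. The 3072 x 3072 shape and the rank of 16 are assumptions chosen for illustration, not values taken from this PR.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates every entry of the d_out x d_in matrix.
    LoRA freezes it and trains two low-rank factors instead:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Assumed values for illustration only: a square 3072-wide projection
# and rank 16. The model card does not report either number.
full, lora = lora_param_counts(d_in=3072, d_out=3072, rank=16)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA adapter:     {lora:,} parameters")
print(f"fraction trained: {lora / full:.4f}")
```

Stating the actual rank, alpha, and target modules in the card would let readers reproduce this estimate for the real adapter.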
+## Supported Languages
+Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
+# Intended Use
+## Primary Use Cases
+- Multilingual text generation
+- Cross-lingual translation
+- Text summarization
+- Conversational AI in Indic languages
+- Language understanding and generation tasks
 ## How to Get Started with the Model
 ### Training Data
+- **Dataset Composition:** Curated collection of text from 11 Indic languages
+- **Languages Covered:** Hindi, Punjabi, Marathi, Malayalam, Oriya, Kannada, Gujarati, Bengali, Urdu, Tamil, and Telugu
 ### Training Parameters
 - **Learning Rate**: 5e-05
+## Hardware and Environmental Impact
+### Training Infrastructure
+- **Hardware:** T4 GPU
 - **Cloud Provider:** Google Cloud Platform
+- **Region:** asia-southeast1
+- **Training Duration:** 29 hours
+### Environmental Impact Assessment
+- **Carbon Emissions:** 0.85 kgCO₂eq
+- **Carbon Offset:** 100% offset by the cloud provider
+- **Location:** asia-southeast1 region
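Review note on the emissions figure: the card does not say how 0.85 kgCO₂eq was derived. A common estimate multiplies GPU power draw by training hours and by the grid's carbon intensity. The sketch below works backwards from the reported numbers to the carbon intensity they imply. The 70 W figure is an assumption (the T4's rated board power), and the calculation ignores CPU, memory, and datacenter overhead.

```python
def implied_carbon_intensity(emissions_kg, power_watts, hours):
    """Return the grid carbon intensity (kgCO2eq per kWh) implied by
    a reported emissions total, a constant power draw, and a duration."""
    energy_kwh = power_watts * hours / 1000.0
    return emissions_kg / energy_kwh


# Reported in the model card: 29 hours on a T4, 0.85 kgCO2eq.
# Assumption: the GPU draws its rated 70 W for the whole run.
T4_POWER_WATTS = 70
energy_kwh = T4_POWER_WATTS * 29 / 1000.0
intensity = implied_carbon_intensity(0.85, T4_POWER_WATTS, 29)

print(f"energy used:       {energy_kwh:.2f} kWh")
print(f"implied intensity: {intensity:.3f} kgCO2eq/kWh")
```

Under these assumptions the run used about 2.03 kWh, which implies roughly 0.42 kgCO₂eq per kWh. Naming the estimation method in the card would make the figure verifiable.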
+## Limitations and Biases
+- The model's performance may vary across different Indic languages
+- The model inherits both capabilities and limitations of the base Llama3.2-3B-Instruct model
+- Users should conduct appropriate testing for their specific use cases
+## License
+[More Information Needed]
+## Citation and References
+[More Information Needed]