---
base_model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
library_name: peft
license: apache-2.0
datasets:
- garage-bAInd/Open-Platypus
language:
- en
tags:
- MATH
- LEETCODE
- text-generation-inference
- SCIENCE
---

# Model Card for SicMundus

## Model Details

### Model Description
This model, **SicMundus**, is a fine-tuned version of `unsloth/Llama-3.2-1B-Instruct` using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation). It was trained on the `Open-Platypus` dataset with a structured Alpaca-style prompt format. The primary goal is to enhance instruction-following capabilities while maintaining efficiency through 4-bit quantization.

- **Developed by:** Ragul
- **Funded by:** Self-funded
- **Organization:** Pinnacle Organization
- **Shared by:** Ragul
- **Model type:** Instruction-tuned Language Model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** `unsloth/Llama-3.2-1B-Instruct`

### Model Sources
- **Repository:** [ragul2607/SicMundus](https://huggingface.co/ragul2607/SicMundus)
- **Paper:** N/A
- **Demo:** Not currently available

## Uses

### Direct Use
- General-purpose instruction-following tasks
- Text generation
- Code generation assistance
- Conversational AI applications

### Downstream Use
- Further fine-tuning on domain-specific datasets
- Deployment in chatbot applications
- Text summarization or document completion

### Out-of-Scope Use
- Not designed for real-time critical applications (e.g., medical or legal advice)
- May not be suitable for handling highly sensitive data

## Bias, Risks, and Limitations

While the model is designed to be a general-purpose assistant, it inherits biases from the pre-trained Llama model and the Open-Platypus dataset. Users should be aware of potential biases in generated responses, particularly regarding sensitive topics.

### Recommendations
- Use in conjunction with human oversight.
- Avoid deploying in high-stakes scenarios without additional testing.

## How to Get Started with the Model

To use the fine-tuned model, follow these steps:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "ragul2607/SicMundus"  # or a local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
# With `peft` installed, transformers loads the LoRA adapter on top of its base model.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

def generate_response(prompt):
    # Use model.device rather than a hardcoded "cuda" so this also runs on CPU/MPS.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(output[0], skip_special_tokens=True)

prompt = "Explain the concept of reinforcement learning."
print(generate_response(prompt))
```

## Training Details

### Training Data
- **Dataset:** `garage-bAInd/Open-Platypus`
- **Preprocessing:** The dataset was formatted using Alpaca-style prompts with instruction, input, and output fields.
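The exact template used in training is not reproduced in this card; the sketch below shows the standard Alpaca-style layout the description implies, with `format_example` as a hypothetical helper name mapping an Open-Platypus record to a single training string:

```python
# Standard Alpaca prompt template (assumed; the card only states "Alpaca-style").
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def format_example(example):
    # Open-Platypus records carry "instruction", "input", and "output" fields.
    return ALPACA_TEMPLATE.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        output=example.get("output", ""),
    )

print(format_example({"instruction": "Add 2 and 3.", "input": "", "output": "5"}))
```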

### Training Procedure
- **Training Framework:** Hugging Face `transformers` + `trl` (PEFT + LoRA)
- **Precision:** Mixed precision (FP16/BF16 based on hardware support)
- **Batch size:** 2 per device with gradient accumulation
- **Learning rate:** 2e-4
- **Max Steps:** 100
- **Optimizer:** AdamW 8-bit
- **LoRA Config:** Applied to key transformer layers (q_proj, k_proj, v_proj, etc.)
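A hypothetical reconstruction of this setup with `peft` and `transformers` might look as follows; the rank, alpha, dropout, target-module list, and gradient-accumulation steps are illustrative, since only the hyperparameters listed above are published:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Illustrative LoRA configuration (r/alpha/dropout not stated in the card).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)

# Hyperparameters taken from the list above where stated.
training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,  # illustrative; card only says "with gradient accumulation"
    learning_rate=2e-4,
    max_steps=100,
    optim="adamw_8bit",
    fp16=True,  # or bf16=True on hardware that supports it
)
```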

### Speeds, Sizes, Times
- **Checkpoint Size:** ~2GB (LoRA adapters stored separately)
- **Fine-tuning Time:** ~1 hour on A100 GPU

## Evaluation

### Testing Data, Factors & Metrics
- **Testing Data:** A subset of Open-Platypus
- **Factors:** Performance on general instruction-following tasks
- **Metrics:**
  - Perplexity (PPL)
  - Response Coherence
  - Instruction-following accuracy
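Perplexity is the exponential of the mean per-token negative log-likelihood. A minimal sketch of the computation, using illustrative NLL values rather than measured ones:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Illustrative per-token NLLs for a held-out prompt (not real measurements).
print(perplexity([2.1, 1.8, 2.4, 2.0]))
```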

### Results
- **Perplexity:** TBD
- **Response Quality:** Qualitatively improved over base model on test prompts

## Model Examination
- **Interpretability:** Standard transformer-based behavior with LoRA fine-tuning.
- **Explainability:** Outputs can be analyzed with attention visualization tools.

## Environmental Impact
- **Hardware Type:** A100 GPU
- **Hours used:** ~1 hour
- **Cloud Provider:** Local GPU / AWS
- **Carbon Emitted:** Estimated using [Machine Learning Impact Calculator](https://mlco2.github.io/impact)

## Technical Specifications

### Model Architecture and Objective
- Transformer-based architecture (Llama-3.2-1B)
- Instruction-following optimization with PEFT-LoRA

### Compute Infrastructure
- **Hardware:** A100 GPU
- **Software:** Python, PyTorch, `transformers`, `unsloth`, `peft`

## Citation
If using this model, please cite:

```bibtex
@misc{SicMundus,
  author = {Ragul},
  title = {SicMundus: Fine-Tuned Llama-3.2-1B-Instruct},
  year = {2025},
  url = {https://huggingface.co/ragul2607/SicMundus}
}
```

## More Information
- **Contact:** [github.com/ragultv](https://github.com/ragultv)
- **Further Work:** Integrate with RLHF for better alignment

## Model Card Authors
- Ragul