---
base_model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
library_name: peft
license: apache-2.0
datasets:
- garage-bAInd/Open-Platypus
language:
- en
tags:
- math
- leetcode
- text-generation-inference
- science
---
# Model Card for SicMundus
## Model Details
### Model Description
**SicMundus** is a fine-tuned version of `unsloth/Llama-3.2-1B-Instruct`, trained with Parameter-Efficient Fine-Tuning (PEFT) using LoRA (Low-Rank Adaptation). It was trained on the `Open-Platypus` dataset using a structured Alpaca-style prompt format. The primary goal is to improve instruction-following capability while remaining efficient through 4-bit quantization.
- **Developed by:** Ragul
- **Funded by:** Self-funded
- **Organization:** Pinnacle Organization
- **Shared by:** Ragul
- **Model type:** Instruction-tuned Language Model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** `unsloth/Llama-3.2-1B-Instruct`
### Model Sources
- **Repository:** [ragul2607/SicMundus](https://huggingface.co/ragul2607/SicMundus)
- **Paper:** N/A
- **Demo:** N/A
## Uses
### Direct Use
- General-purpose instruction-following tasks
- Text generation
- Code generation assistance
- Conversational AI applications
### Downstream Use
- Further fine-tuning on domain-specific datasets
- Deployment in chatbot applications
- Text summarization or document completion
### Out-of-Scope Use
- Not designed for high-stakes applications such as medical, legal, or financial advice
- May not be suitable for handling highly sensitive data
## Bias, Risks, and Limitations
While the model is designed to be a general-purpose assistant, it inherits biases from the pre-trained Llama model and the Open-Platypus dataset. Users should be aware of potential biases in generated responses, particularly regarding sensitive topics.
### Recommendations
- Use in conjunction with human oversight.
- Avoid deploying in high-stakes scenarios without additional testing.
## How to Get Started with the Model
To use the fine-tuned model, follow these steps:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repo id (or a local path to the downloaded checkpoint)
model_path = "ragul2607/SicMundus"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

def generate_response(prompt: str) -> str:
    # Move inputs to whichever device the model was placed on
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(output[0], skip_special_tokens=True)

prompt = "Explain the concept of reinforcement learning."
print(generate_response(prompt))
```
## Training Details
### Training Data
- **Dataset:** `garage-bAInd/Open-Platypus`
- **Preprocessing:** The dataset was formatted using Alpaca-style prompts with instruction, input, and output fields.
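The Alpaca-style formatting can be sketched as follows; the exact template wording used during training is an assumption, but the instruction/input/output structure matches the description above.

```python
# A minimal sketch of Alpaca-style prompt formatting. The template wording
# is an assumption; only the instruction/input/output structure is given
# in the card.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def format_example(example: dict) -> str:
    """Render one Open-Platypus record into a single training string."""
    return ALPACA_PROMPT.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        output=example.get("output", ""),
    )

sample = {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"}
text = format_example(sample)
```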
### Training Procedure
- **Training Framework:** Hugging Face `transformers` + `trl` (PEFT + LoRA)
- **Precision:** Mixed precision (FP16/BF16 based on hardware support)
- **Batch size:** 2 per device with gradient accumulation
- **Learning rate:** 2e-4
- **Max Steps:** 100
- **Optimizer:** AdamW 8-bit
- **LoRA Config:** Applied to key transformer layers (q_proj, k_proj, v_proj, etc.)
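A `peft` configuration consistent with the setup above might look like the following. The rank, alpha, and dropout values are illustrative assumptions, not the exact hyperparameters used for this checkpoint; the target-module list extends the layers named above with the other projections commonly adapted in Llama models.

```python
from peft import LoraConfig

# Illustrative values: r, lora_alpha, and lora_dropout are assumptions,
# not the recorded hyperparameters for this checkpoint.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```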
### Speeds, Sizes, Times
- **Checkpoint Size:** ~2GB (LoRA adapters stored separately)
- **Fine-tuning Time:** ~1 hour on A100 GPU
## Evaluation
### Testing Data, Factors & Metrics
- **Testing Data:** A subset of Open-Platypus
- **Factors:** Performance on general instruction-following tasks
- **Metrics:**
- Perplexity (PPL)
- Response Coherence
- Instruction-following accuracy
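Perplexity can be computed from per-token log-probabilities; a minimal sketch (natural-log convention):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean per-token log-probability
    (natural log). Lower is better."""
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-avg_logprob)

# A model that assigns probability 1/2 to every token has perplexity 2.
ppl = perplexity([-math.log(2)] * 4)
```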
### Results
- **Perplexity:** TBD
- **Response Quality:** Qualitatively improved over base model on test prompts
## Model Examination
- **Interpretability:** Standard transformer-based behavior with LoRA fine-tuning.
- **Explainability:** Outputs can be analyzed with attention visualization tools.
## Environmental Impact
- **Hardware Type:** A100 GPU
- **Hours used:** ~1 hour
- **Cloud Provider:** Not specified
- **Carbon Emitted:** Estimated using [Machine Learning Impact Calculator](https://mlco2.github.io/impact)
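The calculator's estimate reduces to an energy-times-intensity formula; a rough sketch with illustrative numbers (the GPU wattage and grid carbon intensity below are assumptions, not measured values):

```python
def carbon_kg(gpu_watts: float, hours: float,
              intensity_kg_per_kwh: float, pue: float = 1.0) -> float:
    """kgCO2eq = power (kW) x time (h) x PUE x grid carbon intensity."""
    return (gpu_watts / 1000.0) * hours * pue * intensity_kg_per_kwh

# One A100 (~400 W board power, an assumption) running for 1 hour on a
# grid emitting 0.4 kgCO2eq/kWh (also an assumption):
estimate = carbon_kg(400, 1.0, 0.4)
```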
## Technical Specifications
### Model Architecture and Objective
- Transformer-based architecture (Llama-3.2-1B)
- Instruction-following optimization with PEFT-LoRA
### Compute Infrastructure
- **Hardware:** NVIDIA A100 GPU
- **Software:** Python, PyTorch, `transformers`, `unsloth`, `peft`
## Citation
If using this model, please cite:
```bibtex
@misc{SicMundus,
  author = {Ragul},
  title  = {SicMundus: Fine-Tuned Llama-3.2-1B-Instruct},
  year   = {2025},
  url    = {https://huggingface.co/ragul2607/SicMundus}
}
```
## More Information
- **Contact:** [github.com/ragultv](https://github.com/ragultv)
- **Further Work:** Integrate with RLHF for better alignment
## Model Card Authors
- Ragul