File size: 7,170 Bytes
bb6013f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
6ebcc7c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- conversational
- chatbot
- casual
- bro
- unsloth
- llama
- lora
license: llama3.2
base_model: unsloth/Llama-3.2-3B-Instruct
model-index:
- name: Bro Chatbot
  results: []
datasets:
- custom
widget:
- text: "What's the meaning of life?"
  example_title: "Philosophy Question"
- text: "How do I learn Python?"
  example_title: "Learning Question"
- text: "I'm feeling stressed about work."
  example_title: "Support Question"
---

# Bro Chatbot - Fine-tuned Llama 3.2 3B

A conversational AI chatbot with a distinctive "bro" personality - casual, chilled, and supportive. Fine-tuned using Unsloth for efficient training.

## Model Details

- **Base Model**: unsloth/Llama-3.2-3B-Instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with Unsloth
- **Model Type**: Conversational AI Chatbot
- **Language**: English
- **License**: Same as base model

## Training Configuration

### Model Loading Parameters
```python
max_seq_length = 2048    # Maximum context window
dtype = None             # Auto-detect optimal precision
load_in_4bit = True      # 4-bit quantization for memory efficiency
```

### LoRA Configuration
```python
r = 16                   # LoRA rank
lora_alpha = 16          # LoRA scaling factor
lora_dropout = 0         # No dropout
target_modules = [       # Modules to adapt
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj"
]
```

### Training Parameters
```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
learning_rate = 2e-4
max_steps = 60
warmup_steps = 5
weight_decay = 0.01
optimizer = "adamw_8bit"
```

## Usage

### Installation
```bash
pip install unsloth transformers torch
```

### Loading the Model
```python
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

# Load model with exact training parameters
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "your-username/bro-chatbot",
    max_seq_length = 2048,    # IMPORTANT: Use same as training
    dtype = None,
    load_in_4bit = True,      # IMPORTANT: Use same as training
)

# Setup chat template
tokenizer = get_chat_template(tokenizer, chat_template = "llama-3.1")

# Enable fast inference
FastLanguageModel.for_inference(model)
```

### Basic Usage
```python
def chat_with_bro(message):
    messages = [{"role": "user", "content": message}]
    inputs = tokenizer.apply_chat_template(
        messages, 
        tokenize=True, 
        add_generation_prompt=True, 
        return_tensors="pt"
    ).to("cuda")
    
    # Generate response
    outputs = model.generate(
        input_ids=inputs, 
        max_new_tokens=128,
        use_cache=True, 
        temperature=0.7, 
        min_p=0.1
    )
    
    # Extract only the new response
    input_length = inputs.shape[1]
    response = tokenizer.decode(outputs[0][input_length:], skip_special_tokens=True)
    return response

# Example usage
response = chat_with_bro("How do I learn Python?")
print(response)
```

### Streaming Usage
```python
from transformers import TextStreamer

def chat_with_bro_streaming(message):
    messages = [{"role": "user", "content": message}]
    inputs = tokenizer.apply_chat_template(
        messages, 
        tokenize=True, 
        add_generation_prompt=True, 
        return_tensors="pt"
    ).to("cuda")
    
    # Stream response in real-time
    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    model.generate(
        input_ids=inputs, 
        streamer=text_streamer, 
        max_new_tokens=128,
        use_cache=True, 
        temperature=0.7, 
        min_p=0.1
    )

# Example usage
chat_with_bro_streaming("What's the meaning of life?")
```

## Personality Traits

The model exhibits these characteristics:
- **Casual**: Uses relaxed language, slang, and informal expressions
- **Chilled**: Maintains a calm, easy-going attitude in all responses  
- **Supportive**: Always encouraging and positive, offering help and motivation
- **Friendly**: Uses terms like "bro", "dude", "man" in conversations
- **Helpful**: Provides useful information while maintaining the casual tone

## Training Data

- **Dataset Size**: 57 conversation pairs
- **Data Format**: User-assistant conversations in JSON format
- **Topics Covered**: 
  - General knowledge questions
  - Personal advice and motivation
  - Technical explanations
  - Casual conversations
  - Problem-solving scenarios

## Example Conversations

**User**: "What is the capital of France?"
**Bro Bot**: "Yo bro! The capital of France is Paris, dude. Beautiful city with the Eiffel Tower and all that good stuff. You thinking of visiting or just curious? Either way, you got this!"

**User**: "I'm feeling really stressed about my job interview tomorrow."
**Bro Bot**: "Hey man, I totally get that stress - interviews can be nerve-wracking! But listen bro, you wouldn't have gotten the interview if they didn't see something awesome in you already. Just be yourself, take some deep breaths, and remember you're gonna crush it. You got the skills, now just show 'em that chill confidence!"

## Hardware Requirements

### Minimum Requirements
- **GPU**: 8GB VRAM (RTX 3070, RTX 4060 Ti, or equivalent)
- **RAM**: 16GB system RAM
- **Storage**: 10GB free space

### Recommended Requirements  
- **GPU**: 16GB+ VRAM (RTX 4080, RTX 4090, or equivalent)
- **RAM**: 32GB system RAM
- **Storage**: 20GB+ free space

## Performance Notes

- **Inference Speed**: ~2x faster with Unsloth optimizations
- **Memory Usage**: ~75% reduction with 4-bit quantization
- **Context Window**: 2048 tokens maximum
- **Response Quality**: Maintains personality while being informative

## Limitations

- Trained on a relatively small dataset (57 examples)
- May occasionally break character in complex technical discussions
- Limited to English language conversations
- Context window limited to 2048 tokens
- Requires GPU for optimal performance

## Technical Details

### Model Architecture
- **Base**: Llama 3.2 3B parameters
- **Adaptation**: LoRA with rank 16
- **Quantization**: 4-bit using bitsandbytes
- **Chat Template**: Llama 3.1 format

### Training Infrastructure
- **Framework**: Unsloth + TRL (Transformers Reinforcement Learning)
- **Optimization**: AdamW 8-bit optimizer
- **Memory**: Gradient checkpointing enabled
- **Platform**: Tested on Windows and Linux

## Citation

If you use this model, please cite:

```bibtex
@misc{bro-chatbot-2024,
  title={Bro Chatbot: A Casual Conversational AI},
  author={Your Name},
  year={2024},
  howpublished={HuggingFace Model Hub},
  url={https://huggingface.co/your-username/bro-chatbot}
}
```

## Acknowledgments

- **Unsloth**: For efficient fine-tuning framework
- **Meta**: For the Llama 3.2 base model
- **HuggingFace**: For model hosting and transformers library

## Contact

For questions or issues, please open an issue on the model repository or contact [your-email@example.com].

---

**Note**: This model is for research and educational purposes. Please use responsibly and be aware of potential biases in AI-generated content.