---
license: mit
base_model: microsoft/Phi-4-mini-instruct
tags:
- phi4
- gguf
- quantized
- q4_k_m
- buildsnpper
- sap-assessor
- chatbot
- customer-support
language:
- en
pipeline_tag: text-generation
---

# Buildsnpper SAP Assessor Platform Chatbot (Q4_K_M)

A fine-tuned Phi-4-mini-instruct model that powers the customer support chatbot for the Buildsnpper SAP Assessor Platform.

## Model Details

- **Base Model**: microsoft/Phi-4-mini-instruct (3.8B parameters)
- **Fine-tuning**: LoRA (rank=16, alpha=32)
- **Format**: GGUF Q4_K_M quantized
- **Size**: ~2.5GB
- **Context Length**: up to 131,072 tokens (inherited from the base model; fine-tuning used a maximum sequence length of 1,024)
- **Training Data**: 89 Q&A pairs covering Buildsnpper platform features, workflows, and common user questions

## Use Cases

This model is specifically trained to answer questions about:
- Project and client management in Buildsnpper
- Subscription and credit system
- Platform features and navigation
- Common technical issues
- Account management
- Report generation and exports

## Usage

### With llama.cpp

```bash
# Download the model
wget https://huggingface.co/bricksandbotltd/buildsnpper-chatbot-Q4_K_M/resolve/main/buildsnpper-chatbot-Q4_K_M.gguf

# Run with llama.cpp
./llama-cli -m buildsnpper-chatbot-Q4_K_M.gguf -p "How do I create a new project in Buildsnpper?" -n 256
```

### With Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="buildsnpper-chatbot-Q4_K_M.gguf",
    n_ctx=2048,
    n_threads=4
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "How do I assign credits to a client?"}
    ],
    temperature=0.1,
    max_tokens=256
)

print(response['choices'][0]['message']['content'])
```
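For multi-turn support conversations, the message list passed to `create_chat_completion` has to carry earlier turns while staying well inside the `n_ctx=2048` window used above. The following is a minimal sketch rather than part of the released code: `build_messages`, `SYSTEM_PROMPT`, and the `max_turns` cutoff are hypothetical names and values chosen for illustration, and trimming by message count is only a rough proxy for trimming by tokens.

```python
from typing import Dict, List

# Hypothetical system prompt; the prompt used during fine-tuning is not
# documented in this card.
SYSTEM_PROMPT = "You are a support assistant for the Buildsnpper SAP Assessor Platform."


def build_messages(
    history: List[Dict[str, str]],
    question: str,
    max_turns: int = 4,
) -> List[Dict[str, str]]:
    """Return a chat message list: system prompt, recent history, new question.

    Only the last `max_turns` user/assistant exchanges are kept (two messages
    per exchange) so the prompt stays small relative to the context window.
    """
    recent = history[-2 * max_turns:] if max_turns > 0 else []
    return (
        [{"role": "system", "content": SYSTEM_PROMPT}]
        + recent
        + [{"role": "user", "content": question}]
    )


history = [
    {"role": "user", "content": "How do I create a new project?"},
    {"role": "assistant", "content": "(previous model reply)"},
]
messages = build_messages(history, "How do I assign credits to a client?")
```

The resulting list can be passed as `messages=messages` to `llm.create_chat_completion(...)` in place of the single-message list shown above. Appending each user question and model reply to `history` continues the conversation.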

## Training Details

- **LoRA Configuration**:
  - Rank: 16
  - Alpha: 32
  - Target modules: qkv_proj, o_proj
  - Dropout: 0.05

- **Training Parameters**:
  - Epochs: 3
  - Learning rate: 3e-4
  - Max sequence length: 1024
  - Gradient accumulation: 4 steps
  - Final training loss: 1.42

- **Hardware**: Apple M3 MacBook Air (MPS acceleration)
- **Training time**: ~1.5 hours
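
The hyperparameters above imply a few derived quantities that are useful when reproducing or adjusting the run. The sketch below computes them under one loudly stated assumption: the per-device batch size is not given in this card, so `batch_size = 1` is a guess, and the step count is an approximation rather than a logged value.

```python
import math

# Values taken from the lists above.
rank = 16
alpha = 32
num_examples = 89
epochs = 3
grad_accum_steps = 4

# Assumption: the per-device batch size is not stated in this card; 1 is a guess.
batch_size = 1

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = alpha / rank

# Examples consumed per optimizer update.
effective_batch = batch_size * grad_accum_steps

# Approximate optimizer updates, counting a final partial batch as one update.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

print(f"LoRA scale: {lora_scale}")
print(f"Effective batch size: {effective_batch}")
print(f"Approx. optimizer steps: {total_steps}")
```

A different batch size changes the step count but not the LoRA scale, which depends only on the rank and alpha.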

## Quantization

The original FP16 model (7.67GB) was quantized to Q4_K_M format (2.5GB) using llama.cpp. The quantized model:
- Is about 67% smaller than the FP16 original
- Is suited to CPU inference
- Is expected to show only minor quality degradation
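
The size figures above can be checked with a short calculation. This sketch uses only the numbers quoted in this card; the bits-per-weight estimate assumes 1 GB = 1e9 bytes and ignores GGUF metadata overhead, so treat it as approximate.

```python
# Figures quoted in this card.
fp16_size_gb = 7.67
q4_size_gb = 2.5
num_params = 3.8e9  # base model parameter count

# Relative size reduction from FP16 to Q4_K_M.
reduction = 1 - q4_size_gb / fp16_size_gb

# Average bits stored per parameter, assuming 1 GB = 1e9 bytes and ignoring
# GGUF metadata overhead.
bits_per_weight = q4_size_gb * 1e9 * 8 / num_params

print(f"Size reduction: {reduction:.1%}")
print(f"Approx. bits per weight: {bits_per_weight:.2f}")
```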

## Limitations

- Specialized for Buildsnpper platform only
- May not perform well on general queries outside the platform domain
- Designed for customer support, not general conversation

## License

MIT License. See the base model license for any additional restrictions.

## Contact

- Organization: [bricksandbotltd](https://huggingface.co/bricksandbotltd)
- Platform: [Buildsnpper SAP Assessor Platform](https://buildsnpper.com)