---
license: mit
base_model: microsoft/Phi-4-mini-instruct
tags:
- phi4
- gguf
- quantized
- q4_k_m
- buildsnpper
- sap-assessor
- chatbot
- customer-support
language:
- en
pipeline_tag: text-generation
---
# Buildsnpper SAP Assessor Platform Chatbot (Q4_K_M)
Fine-tuned Phi-4-mini-instruct model for the Buildsnpper SAP Assessor Platform customer support chatbot.
## Model Details
- **Base Model**: microsoft/Phi-4-mini-instruct (3.8B parameters)
- **Fine-tuning**: LoRA (rank=16, alpha=32)
- **Format**: GGUF Q4_K_M quantized
- **Size**: ~2.5GB
- **Context Length**: 131,072 tokens
- **Training Data**: 89 Q&A pairs covering Buildsnpper platform features, workflows, and common user questions
## Use Cases
This model is specifically trained to answer questions about:
- Project and client management in Buildsnpper
- Subscription and credit system
- Platform features and navigation
- Common technical issues
- Account management
- Report generation and exports
## Usage
### With llama.cpp
```bash
# Download the model
wget https://huggingface.co/bricksandbotltd/buildsnpper-chatbot-Q4_K_M/resolve/main/buildsnpper-chatbot-Q4_K_M.gguf
# Run with llama.cpp
./llama-cli -m buildsnpper-chatbot-Q4_K_M.gguf -p "How do I create a new project in Buildsnpper?" -n 256
```
### With Python (llama-cpp-python)
```python
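from llama_cpp import Llama

# Load the quantized GGUF file from the current directory
llm = Llama(
    model_path="buildsnpper-chatbot-Q4_K_M.gguf",
    n_ctx=2048,   # context window (in tokens) allocated for this session
    n_threads=4,  # number of CPU threads used for inference
)

# Ask a single question using the chat-completion interface
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "How do I assign credits to a client?"}
    ],
    temperature=0.1,  # low temperature for more deterministic answers
    max_tokens=256,
)

print(response["choices"][0]["message"]["content"])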
```
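For multi-turn conversations, the `messages` list passed to `create_chat_completion` can carry earlier turns as well as the new question. The helper below is a minimal sketch of that idea. The name `build_messages`, the optional system prompt, and the placeholder history are illustrative assumptions; this model card does not state which system prompt, if any, was used during fine-tuning.

```python
from typing import Dict, List, Optional, Tuple


def build_messages(
    question: str,
    history: Optional[List[Tuple[str, str]]] = None,
    system_prompt: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Build a chat-style message list (hypothetical helper, not part of llama-cpp-python).

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages: List[Dict[str, str]] = []
    if system_prompt:
        # Assumption: a system prompt is optional and supplied by the caller.
        messages.append({"role": "system", "content": system_prompt})
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The new question always goes last.
    messages.append({"role": "user", "content": question})
    return messages


# Placeholder history; the assistant text stands in for a real earlier reply.
example = build_messages(
    "How do I assign credits to a client?",
    history=[("How do I create a new project?", "(previous assistant reply)")],
)
```

The returned list can be passed as `messages=example` in the call shown above. Keep in mind that the whole conversation has to fit within the `n_ctx` value chosen when loading the model.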
## Training Details
- **LoRA Configuration**:
  - Rank: 16
  - Alpha: 32
  - Target modules: qkv_proj, o_proj
  - Dropout: 0.05
- **Training Parameters**:
  - Epochs: 3
  - Learning rate: 3e-4
  - Max sequence length: 1024
  - Gradient accumulation: 4 steps
  - Final training loss: 1.42
- **Hardware**: Apple M3 MacBook Air (MPS acceleration)
- **Training time**: ~1.5 hours
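To make the numbers above more concrete, the following sketch works through what they imply under the standard LoRA formulation: the scaling factor `alpha / rank`, the number of trainable parameters added per adapted layer, and a rough count of optimizer steps. The layer dimensions in the example call and the micro-batch size of 1 are illustrative assumptions, not values from this model card, and the step count ignores framework-specific rounding.

```python
import math

# Values taken from the Training Details list above.
RANK = 16
ALPHA = 32
NUM_EXAMPLES = 89
EPOCHS = 3
GRAD_ACCUM_STEPS = 4


def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / rank."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    The adapter is two matrices: A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def optimizer_steps(num_examples: int, epochs: int, grad_accum: int,
                    micro_batch_size: int = 1) -> int:
    """Approximate optimizer steps; micro_batch_size=1 is an assumption."""
    micro_batches = math.ceil(num_examples / micro_batch_size)
    return math.ceil(micro_batches / grad_accum) * epochs


scaling = lora_scaling(RANK, ALPHA)
# 1024 x 1024 is a made-up layer shape, used only to show the formula.
example_layer_params = lora_params(d_in=1024, d_out=1024, rank=RANK)
steps = optimizer_steps(NUM_EXAMPLES, EPOCHS, GRAD_ACCUM_STEPS)

print(f"LoRA scaling: {scaling}")
print(f"Adapter parameters for a 1024x1024 layer: {example_layer_params}")
print(f"Approximate optimizer steps: {steps}")
```

With only 89 examples, the total number of weight updates is small, which is worth bearing in mind when reading the final training loss.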
## Quantization
The original FP16 model (7.67GB) was quantized to Q4_K_M format (2.5GB) using llama.cpp, with the following results:
- 67% size reduction
- Optimized for CPU inference
- Minimal quality degradation
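The size-reduction figure can be checked directly from the two file sizes quoted above. The sketch below also estimates average bits per weight from the 3.8B parameter count. That estimate is approximate: it assumes 1GB means 10^9 bytes and counts file metadata as if it were weights.

```python
# Figures quoted in this model card.
FP16_SIZE_GB = 7.67
Q4_K_M_SIZE_GB = 2.5
PARAM_COUNT = 3.8e9


def size_reduction(original_gb: float, quantized_gb: float) -> float:
    """Fraction of the original file size saved by quantization."""
    return 1 - quantized_gb / original_gb


def bits_per_weight(size_gb: float, n_params: float) -> float:
    """Rough average bits per parameter, assuming 1GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / n_params


reduction = size_reduction(FP16_SIZE_GB, Q4_K_M_SIZE_GB)
q4_bits = bits_per_weight(Q4_K_M_SIZE_GB, PARAM_COUNT)

print(f"Size reduction: {reduction:.1%}")
print(f"Approximate bits per weight after quantization: {q4_bits:.2f}")
```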
## Limitations
- Specialized for Buildsnpper platform only
- May not perform well on general queries outside the platform domain
- Designed for customer support, not general conversation
## License
MIT License - See base model license for additional restrictions.
## Contact
- Organization: [bricksandbotltd](https://huggingface.co/bricksandbotltd)
- Platform: [Buildsnpper SAP Assessor Platform](https://buildsnpper.com)