ClarityMentor LoRA (Failed Experiment)
⚠️ STATUS: FAILED - This model demonstrates what NOT to do when fine-tuning a life mentor LLM.
What This Is
A QLoRA fine-tune of Qwen/Qwen2.5-1.5B-Instruct intended to be a philosophical life mentor called "ClarityMentor".
Why It Failed
The model only asks questions regardless of user input. Even when asked "what is apple" or "hello", it responds with philosophical clarifying questions.
Root Causes:
- Overweighted question-asking data: The conversation_starters dataset (2,999 samples) was designed to train Socratic questioning behavior
- Imbalanced training mix: Too many datasets emphasized asking clarifying questions
- System prompt mismatch: Training data used a question-heavy system prompt, but we wanted direct answers
- Overfitting to patterns: Model memorized a narrow set of question templates instead of learning diverse responses
Evidence of Failure:
User: hello
Model: What does success look like to you personally?
User: what is apple
Model: What makes certain decisions feel harder than others for you?
User: give me python code
Model: What specifically draws you to this question right now?
User: this world shall burn to ashes
Model: What does success look like to you personally?
The model ignores user input entirely and outputs memorized question templates.
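The collapse can also be checked mechanically. The sketch below is a minimal illustration, not the evaluation actually used for this model: it replays the four prompt/response pairs from the transcript above and counts how many responses end in a question mark and how many are exact repeats across different prompts. The function name `collapse_report` is hypothetical.

```python
from collections import Counter

# Prompt/response pairs copied from the transcript above.
transcript = [
    ("hello", "What does success look like to you personally?"),
    ("what is apple", "What makes certain decisions feel harder than others for you?"),
    ("give me python code", "What specifically draws you to this question right now?"),
    ("this world shall burn to ashes", "What does success look like to you personally?"),
]

def collapse_report(pairs):
    """Summarize how question-bound and repetitive a set of responses is."""
    responses = [response.strip() for _, response in pairs]
    counts = Counter(responses)
    return {
        "total": len(responses),
        "ends_with_question": sum(r.endswith("?") for r in responses),
        "distinct": len(counts),
        # Responses that were returned for more than one prompt.
        "repeated": {r: n for r, n in counts.items() if n > 1},
    }

report = collapse_report(transcript)
print(report)
```

On this transcript every response ends in a question mark, and one template is returned for two unrelated prompts, which matches the behavior described above.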
Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-1.5B-Instruct |
| Method | QLoRA (4-bit NF4 quantization) |
| Hardware | NVIDIA RTX 4050 Laptop GPU (6GB VRAM) |
| Training Time | ~3 hours |
| Training Samples | 27,362 |
| Eval Samples | 1,440 |
| Epochs | 2 |
| Final Train Loss | 0.789 |
| Final Eval Loss | 0.68 |
Dataset Mix (the problem):
| Dataset | Samples | Issue |
|---|---|---|
| Philosophy QA | 11,613 | OK - direct Q&A |
| Counseling | ~10,000 | Question-heavy responses |
| | ~10,000 | OK - varied |
| Quotes | 4,995 | OK - wisdom-based |
| Conversation Starters | 2,999 | 🔴 Main culprit - trained ONLY to ask questions |
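To put the mix in proportion, the sketch below computes each dataset's share from the counts in the table. It makes two assumptions: the "~10,000" entries are treated as exactly 10,000, and the row whose name is missing is labeled `(unnamed row)`.

```python
# Sample counts as listed in the table; "~10,000" entries are treated as 10,000.
dataset_mix = {
    "Philosophy QA": 11_613,
    "Counseling": 10_000,
    "(unnamed row)": 10_000,
    "Quotes": 4_995,
    "Conversation Starters": 2_999,
}
# Datasets the table flags as question-heavy or question-only.
question_heavy = {"Counseling", "Conversation Starters"}

total = sum(dataset_mix.values())
for name, count in dataset_mix.items():
    print(f"{name:<24}{count:>7,}{count / total:>8.1%}")

question_share = sum(dataset_mix[n] for n in question_heavy) / total
print(f"question-heavy share: {question_share:.1%}")
```

Under these assumptions the listed rows sum to 39,607, which is more than the 27,362 training and 1,440 evaluation samples reported in Training Details. The card does not explain the difference, so the shares are rough estimates only.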
LoRA Config:
r: 16
lora_alpha: 32
lora_dropout: 0
target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- gate_proj
- up_proj
- down_proj
max_seq_length: 512
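For a sense of scale, the sketch below estimates how many trainable parameters this LoRA configuration adds. The layer dimensions are assumptions based on the published Qwen2.5-1.5B configuration and are not stated in this card; verify them against the base model's `config.json` before relying on the result.

```python
# Assumed Qwen2.5-1.5B dimensions (not stated in this card; check config.json).
HIDDEN = 1536          # hidden_size
INTERMEDIATE = 8960    # intermediate_size
KV_DIM = 256           # num_key_value_heads (2) * head_dim (128)
NUM_LAYERS = 28        # num_hidden_layers

R = 16                 # r from the config above
LORA_ALPHA = 32        # lora_alpha from the config above

# (in_features, out_features) of each targeted linear layer in one decoder block.
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, r):
    # LoRA adds A (r x in_features) and B (out_features x r) per adapted layer.
    return r * (in_features + out_features)

per_block = sum(lora_params(i, o, R) for i, o in target_shapes.values())
trainable = per_block * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"scaling (lora_alpha / r): {scaling}")
```

Targeting all seven projection layers gives the adapter a lot of capacity relative to the narrow set of templates it ended up memorizing.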
Training Args:
per_device_train_batch_size: 1
gradient_accumulation_steps: 16
learning_rate: 2e-4
lr_scheduler_type: cosine
warmup_ratio: 0.03
bf16: true
optim: paged_adamw_8bit
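These arguments imply an effective batch size and a step count, sketched below. The calculation assumes a single GPU and rounds up, so the numbers are approximate; the exact rounding differs between trainer versions.

```python
import math

train_samples = 27_362            # from Training Details
epochs = 2                        # from Training Details
per_device_train_batch_size = 1   # from the training args above
gradient_accumulation_steps = 16  # from the training args above
warmup_ratio = 0.03               # from the training args above
num_devices = 1                   # assumption: single laptop GPU

# Samples that contribute to each optimizer update.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps: {total_steps} ({steps_per_epoch} per epoch)")
print(f"warmup steps: {warmup_steps}")
```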
Lessons Learned
- Audit your training data - The conversation_starters dataset was specifically designed to output clarifying questions, which dominated the learned behavior
- Balance response types - Need diverse outputs (questions, answers, advice) not just one pattern
- System prompt matters during training - Should match intended inference behavior
- Test incrementally - Should have validated on a small sample before full training
- Watch for overfitting patterns - Low loss doesn't mean good model behavior
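The first lesson can be made concrete with a small audit. The sketch below is one possible heuristic, not the check used for this project: it flags a response as question-only when every sentence ends in a question mark, then reports the ratio per dataset. The helper names and the sample responses are invented for illustration.

```python
import re

def is_question_only(response: str) -> bool:
    """True when every sentence in the response ends with a question mark."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    return bool(sentences) and all(s.endswith("?") for s in sentences)

def question_only_ratio(responses) -> float:
    """Fraction of responses that consist only of questions."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(is_question_only(r) for r in responses) / len(responses)

# Hypothetical samples, invented for illustration only.
samples = {
    "conversation_starters": [
        "What does success look like to you personally?",
        "What makes this feel urgent? What would change if you waited?",
    ],
    "philosophy_qa": [
        "Stoicism teaches that we control our judgments, not events.",
        "It depends on the tradition. What draws you to the question?",
    ],
}
for name, responses in samples.items():
    print(f"{name}: {question_only_ratio(responses):.0%} question-only")
```

Running an audit like this per dataset before training would have shown which sources teach only question-asking.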
How to Fix (for future attempts)
- Remove or significantly reduce conversation_starters dataset
- Add more datasets with direct, substantive answers
- Include the "no questions" system prompt in training data
- Better balance between question-asking and answer-giving examples
- Add response diversity validation before training
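One way to reduce a dataset's weight is to downsample it until it stays under a chosen share of the mix. The sketch below illustrates that idea under stated assumptions: the function name `cap_dataset_share`, the 5% cap, and the placeholder rows are hypothetical, and only the dataset sizes come from the table above.

```python
import random

def cap_dataset_share(datasets, name, max_share, seed=0):
    """Downsample datasets[name] so it is at most max_share of the combined mix."""
    if not 0 <= max_share < 1:
        raise ValueError("max_share must be in [0, 1)")
    others = sum(len(rows) for key, rows in datasets.items() if key != name)
    # Solve n / (n + others) <= max_share for n.
    limit = int(max_share * others / (1 - max_share))
    rows = datasets[name]
    if len(rows) > limit:
        # A seeded sample keeps the subset reproducible between runs.
        rows = random.Random(seed).sample(rows, limit)
    return {**datasets, name: rows}

# Placeholder rows standing in for real examples; sizes follow the dataset table.
mix = {
    "philosophy_qa": list(range(11_613)),
    "conversation_starters": list(range(2_999)),
}
balanced = cap_dataset_share(mix, "conversation_starters", max_share=0.05)
kept = len(balanced["conversation_starters"])
share = kept / sum(len(rows) for rows in balanced.values())
print(f"kept {kept} conversation starters ({share:.1%} of the mix)")
```

The cap value is a judgment call. It should be validated by checking model outputs, not only the data proportions.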
Usage (not recommended, but here's how)
from unsloth import FastLanguageModel

# Load the adapter together with its 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    "YOUR_USERNAME/claritymentor-lora-failed",
    max_seq_length=512,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

messages = [
    {"role": "system", "content": "You are ClarityMentor, a philosophical mentor."},
    {"role": "user", "content": "What is the meaning of life?"},
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to("cuda")
# do_sample=True is needed for temperature to take effect.
outputs = model.generate(input_ids=inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: Will ask you questions instead of answering...
Framework Versions
- PEFT: 0.18.1
- Transformers: 4.57.3
- Unsloth: 2026.1.3
- PyTorch: 2.9.1+cu128
License
Apache 2.0
This model is shared as a learning resource. Sometimes failures teach more than successes.