---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B
language:
- en
- es
library_name: transformers
pipeline_tag: text-generation
tags:
- wellness
- health-coaching
- sleep
- fitness
- mental-health
- qwen2
- gguf
- coreml
- on-device
---
# Pulse 3B
Pulse is a personal wellness AI coach fine-tuned from **Qwen2.5-3B**. It is designed to help users with sleep, stress, fitness, nutrition, and mental wellbeing in a warm, motivating, science-backed tone.
Pulse is built into the [Pulse app](https://raxtech.io) by Raxtech, and was created by **Abiral Dahal** (Head of Mobile & AI, Raxtech — Bilbao, Spain).
## Highlights
- **3.1B parameters**, Qwen2 architecture, 32K context.
- Ships in three formats so you can run it anywhere:
  - `final/`: BF16 `safetensors` for Hugging Face `transformers`.
  - `gguf/pulse-q4_k_m.gguf`: 4-bit quantized GGUF for `llama.cpp` / Ollama / LM Studio (~1.8 GB, runs on CPU).
  - `coreml/pulse.mlpackage`: INT4 Core ML package for on-device inference on Apple Silicon (iOS / macOS).
## Quick start
### Ollama (easiest)
```bash
# Download the GGUF
huggingface-cli download Abiral129/Pulse3b gguf/pulse-q4_k_m.gguf --local-dir .
# Minimal Modelfile
cat > Modelfile <<'EOF'
FROM ./gguf/pulse-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 2048
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
EOF
ollama create pulse -f Modelfile
ollama run pulse "I've been sleeping 5 hours for a week, what do I do?"
```
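To make the `TEMPLATE` in the Modelfile concrete, the sketch below renders the same ChatML layout in plain Python. The function name `render_chatml` is hypothetical and exists only for illustration; it mirrors the template text shown above and says nothing about how Ollama implements templating internally.

```python
def render_chatml(system: str, prompt: str) -> str:
    """Render a single-turn prompt in the ChatML layout used by the Modelfile TEMPLATE."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"  # left open so the model writes the reply
    )


# Example: the exact string the model would see for one system + user turn.
print(render_chatml("You are Pulse, a personal wellness coach.", "How can I fall asleep faster?"))
```

The prompt ends with an open assistant turn, which is why `<|im_end|>` and `<|im_start|>` are configured as stop sequences: generation halts as soon as the model closes its turn or tries to open a new one.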
### Transformers (BF16)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Weights and tokenizer live in the repo's final/ subfolder.
tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
model = AutoModelForCausalLM.from_pretrained(
    "Abiral129/Pulse3b",
    subfolder="final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Pulse, a personal wellness coach."},
    {"role": "user", "content": "My resting heart rate jumped from 62 to 88. What's going on?"},
]
# return_dict=True also returns the attention mask alongside the input IDs.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# Sampling must be enabled for temperature and top_p to take effect.
out = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
```
### llama.cpp
```bash
./llama-cli -m gguf/pulse-q4_k_m.gguf \
  -p "You are Pulse, a wellness coach." \
  -cnv --temp 0.7 --top-p 0.9 --repeat-penalty 1.1 -c 2048
```
### Core ML (Apple Silicon)
```python
import coremltools as ct
import numpy as np
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
mlmodel = ct.models.MLModel("coreml/pulse.mlpackage")

# Cast the token IDs to int32 before passing them to Core ML.
ids = tok("Hello Pulse", return_tensors="np").input_ids.astype(np.int32)
# One forward pass (predict() runs on macOS only); the result is a dict keyed by output name.
print(mlmodel.predict({"input_ids": ids}))
```
For full token-by-token generation on iOS / macOS, integrate the `.mlpackage` into your app and implement a generation loop that applies greedy decoding or sampling to the logits at each step.
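As a rough illustration of such a loop, here is a minimal NumPy sketch of temperature plus top-p (nucleus) sampling, with greedy decoding as the `temperature=0` case. Everything here is an assumption made for illustration: `next_logits` is a hypothetical callable that you would implement around `mlmodel.predict`, and the name and shape of the logits output depend on how the `.mlpackage` was exported. The loop also re-feeds the whole sequence on every step, so it ignores any KV cache the package may expose.

```python
import numpy as np


def sample_next_token(logits, temperature=0.7, top_p=0.9, rng=None):
    """Pick one token ID from a 1-D logits vector (temperature + top-p sampling)."""
    logits = np.asarray(logits, dtype=np.float64)
    if temperature <= 0:
        return int(np.argmax(logits))  # greedy decoding
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]  # token IDs, most likely first
    cumulative = np.cumsum(probs[order])
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    keep = order[:cutoff]
    kept_probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept_probs))


def generate(next_logits, prompt_ids, eos_id, max_new_tokens=300, **sampling):
    """Token-by-token loop. `next_logits(ids)` is a hypothetical callable that
    returns the next-token logits given all token IDs so far."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = sample_next_token(next_logits(ids), **sampling)
        if token == eos_id:  # stop when the model emits its end-of-turn token
            break
        ids.append(token)
    return ids[len(prompt_ids):]  # only the newly generated tokens
```

In an app, the same two steps (turn logits into a token, append it, repeat until the end-of-turn token) would be ported to Swift; the Python version is only meant to show the control flow.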
## Recommended system prompt
```
You are Pulse, a personal wellness AI coach. You are warm, motivating, empathetic, and science-backed. You help users with sleep, stress, fitness, nutrition, and mental wellbeing. Never say "As an AI" — you are Pulse, a wellness coach. Be concise, practical, and encouraging.
```
## Sampling defaults
| Param | Value |
|---|---|
| `temperature` | 0.7 |
| `top_p` | 0.9 |
| `repeat_penalty` | 1.1 |
| `num_ctx` | 2048 |
| stop | `<\|im_end\|>`, `<\|im_start\|>` |
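The parameter names in the table follow llama.cpp / Ollama conventions. As a hedged sketch, the helper below translates them into keyword arguments for `generate()` in `transformers`. The names `PULSE_DEFAULTS` and `to_generate_kwargs` are hypothetical, and `repetition_penalty` is the closest counterpart to `repeat_penalty`, not a guaranteed exact equivalent.

```python
# Sampling defaults from the table, keyed by their llama.cpp / Ollama names.
PULSE_DEFAULTS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "num_ctx": 2048,
    "stop": ["<|im_end|>", "<|im_start|>"],
}


def to_generate_kwargs(defaults: dict) -> dict:
    """Translate the defaults into keyword arguments for transformers' generate()."""
    return {
        "do_sample": True,  # needed for temperature / top_p to apply
        "temperature": defaults["temperature"],
        "top_p": defaults["top_p"],
        "repetition_penalty": defaults["repeat_penalty"],  # closest transformers counterpart
    }


print(to_generate_kwargs(PULSE_DEFAULTS))
```

`num_ctx` and `stop` are deliberately left out of the result: in `transformers` the context budget is governed by the prompt length plus `max_new_tokens`, and stopping is normally handled through the tokenizer's end-of-sequence token.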
## Intended use
- Conversational wellness coaching: sleep hygiene, stress management, exercise habits, nutrition guidance, mental wellbeing check-ins.
- On-device deployment in mobile apps where privacy and offline use matter.
## Out of scope
- Pulse is **not** a medical device, diagnostic tool, or substitute for a licensed healthcare professional.
- Do not use Pulse for emergency situations, medication decisions, or diagnosing physical or mental health conditions.
- For any persistent or severe symptoms, consult a qualified clinician.
## Limitations
- 3B-parameter model — reasoning depth and factual recall are limited compared to larger models.
- Quantized variants (Q4_K_M, INT4 Core ML) trade some quality for size and speed.
- Training data is biased toward English and Spanish wellness content; performance in other languages may be weaker.
- Can produce confident but incorrect statements ("hallucinations") — always verify health-related claims.
## License
Apache 2.0, inherited from the base model [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B).
## Citation
```bibtex
@misc{pulse3b2026,
title = {Pulse 3B: A wellness coaching language model},
author = {Abiral Dahal and Raxtech},
year = {2026},
url = {https://huggingface.co/Abiral129/Pulse3b}
}
```
## Acknowledgements
Built on top of [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) by the Qwen team at Alibaba. GGUF conversion via [llama.cpp](https://github.com/ggerganov/llama.cpp). Core ML conversion via [coremltools](https://github.com/apple/coremltools).