---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B
language:
- en
- es
library_name: transformers
pipeline_tag: text-generation
tags:
- wellness
- health-coaching
- sleep
- fitness
- mental-health
- qwen2
- gguf
- coreml
- on-device
---

# Pulse 3B

Pulse is a personal wellness AI coach fine-tuned from **Qwen2.5-3B**. It is designed to help users with sleep, stress, fitness, nutrition, and mental wellbeing in a warm, motivating, science-backed tone.

Pulse is built into the [Pulse app](https://raxtech.io) by Raxtech, and was created by **Abiral Dahal** (Head of Mobile & AI, Raxtech — Bilbao, Spain).

## Highlights

- **3.1B parameters**, Qwen2 architecture, 32K context.
- Ships in three formats so you can run it anywhere:
  - `final/` — BF16 `safetensors` for HuggingFace `transformers`.
  - `gguf/pulse-q4_k_m.gguf` — 4-bit quantized GGUF for `llama.cpp` / Ollama / LM Studio (~1.8 GB, runs on CPU).
  - `coreml/pulse.mlpackage` — INT4 Core ML package for on-device inference on Apple Silicon (iOS / macOS).

## Quick start

### Ollama (easiest)

```bash
# Download the GGUF
huggingface-cli download Abiral129/Pulse3b gguf/pulse-q4_k_m.gguf --local-dir .

# Minimal Modelfile
cat > Modelfile <<'EOF'
FROM ./gguf/pulse-q4_k_m.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 2048
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
EOF

ollama create pulse -f Modelfile
ollama run pulse "I've been sleeping 5 hours for a week, what do I do?"
```
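
Once the model is created, an application can also reach it through the local Ollama server's HTTP API instead of the CLI. The following is a minimal sketch, assuming the server is running on its default address (`http://localhost:11434`) and the model was created under the name `pulse` as above. The helper names `build_chat_request` and `chat` are illustrative, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local endpoint


def build_chat_request(user_text, system_text=None, model="pulse"):
    """Build the JSON body for Ollama's /api/chat endpoint."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model,       # the name passed to `ollama create`
        "messages": messages,
        "stream": False,      # ask for one JSON object instead of a stream
    }


def chat(user_text, system_text=None):
    """Send one chat turn to a running Ollama server and return the reply text."""
    body = json.dumps(build_chat_request(user_text, system_text)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


# Example call (requires the Ollama server to be running):
# print(chat("I've been sleeping 5 hours for a week, what do I do?"))
```

The sampling parameters baked into the Modelfile apply by default, so the request does not need to repeat them.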

### Transformers (BF16)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
model = AutoModelForCausalLM.from_pretrained(
    "Abiral129/Pulse3b",
    subfolder="final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Pulse, a personal wellness coach."},
    {"role": "user", "content": "My resting heart rate jumped from 62 to 88. What's going on?"},
]
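# return_dict=True also returns the attention mask that generate() expects.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# do_sample=True is required for temperature / top_p to take effect.
out = model.generate(
    **inputs, max_new_tokens=300, do_sample=True,
    temperature=0.7, top_p=0.9, repetition_penalty=1.1,
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))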
```

### llama.cpp

```bash
./llama-cli -m gguf/pulse-q4_k_m.gguf \
  -p "You are Pulse, a wellness coach." \
  -cnv --temp 0.7 --top-p 0.9 --repeat-penalty 1.1 -c 2048
```

### Core ML (Apple Silicon)

```python
import coremltools as ct
from transformers import AutoTokenizer
import numpy as np

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
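mlmodel = ct.models.MLModel("coreml/pulse.mlpackage")  # predict() requires macOS

# Core ML expects int32 token ids; the input name must match the package's spec.
ids = tok("Hello Pulse", return_tensors="np").input_ids.astype(np.int32)
outputs = mlmodel.predict({"input_ids": ids})

# Print output names and shapes rather than the full logits tensor.
print({name: value.shape for name, value in outputs.items()})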
```

For full token-by-token generation on iOS / macOS, integrate the `.mlpackage` into your app and implement a generation loop that applies greedy decoding or sampling to the output logits.
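
A generation loop of this kind can be sketched in plain Python. This is an illustration under stated assumptions, not the loop used in the Pulse app: `next_logits` is a hypothetical callable that wraps the Core ML model and returns the next-token logits as a list of floats (for a NumPy array, `.tolist()` gives that form), and `eos_id` is the id of the stop token. Repetition penalty is left out to keep the sketch short.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Pick a token id from raw logits with temperature + nucleus (top-p) sampling.

    temperature <= 0 falls back to greedy decoding (argmax).
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])

    # Softmax over temperature-scaled logits (subtract the max for stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def generate(next_logits, prompt_ids, eos_id, max_new_tokens=300, **sampling):
    """Token-by-token loop; `next_logits(ids)` is a hypothetical model wrapper."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = sample_next_token(next_logits(ids), **sampling)
        if token == eos_id:
            break
        ids.append(token)
    return ids[len(prompt_ids):]  # only the newly generated ids
```

Re-running the model on the full sequence at every step is the simplest approach but also the slowest; a production loop would normally reuse cached key/value state if the exported package supports it.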

## Recommended system prompt

```
You are Pulse, a personal wellness AI coach. You are warm, motivating, empathetic, and science-backed. You help users with sleep, stress, fitness, nutrition, and mental wellbeing. Never say "As an AI" — you are Pulse, a wellness coach. Be concise, practical, and encouraging.
```
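
Runtimes that do not apply a chat template for you (for example a hand-written Core ML loop) need the system prompt wrapped in the same ChatML layout that the Ollama `TEMPLATE` uses. The sketch below shows that layout. `render_chatml` is an illustrative helper, and it is an assumption that its output matches the tokenizer's own template exactly; where `tok.apply_chat_template` is available, prefer it.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model writes the reply next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Shortened here; paste the full recommended system prompt in real use.
SYSTEM_PROMPT = "You are Pulse, a personal wellness AI coach."

prompt = render_chatml([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "How can I wind down before bed?"},
])
print(prompt)
```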

## Sampling defaults

| Param | Value |
|---|---|
| `temperature` | 0.7 |
| `top_p` | 0.9 |
| `repeat_penalty` | 1.1 |
| `num_ctx` | 2048 |
| `stop` | `<\|im_end\|>`, `<\|im_start\|>` |
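
The names in the table follow Ollama's conventions. As a rough sketch, the same values can be translated into keyword arguments for `model.generate` in `transformers`. `PULSE_DEFAULTS` and `to_generate_kwargs` are illustrative names, and the mapping covers only the parameters with a direct equivalent: `num_ctx` and the stop strings are handled differently there (context length by the model config, stopping by the end-of-sequence token).

```python
PULSE_DEFAULTS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "num_ctx": 2048,
    "stop": ["<|im_end|>", "<|im_start|>"],
}


def to_generate_kwargs(defaults, max_new_tokens=300):
    """Translate the Ollama-style names into `model.generate` keyword arguments."""
    return {
        # Sampling must be switched on, or temperature / top_p are ignored
        # (unless the model's generation config already enables it).
        "do_sample": True,
        "temperature": defaults["temperature"],
        "top_p": defaults["top_p"],
        "repetition_penalty": defaults["repeat_penalty"],  # transformers' spelling
        "max_new_tokens": max_new_tokens,
    }


print(to_generate_kwargs(PULSE_DEFAULTS))
```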

## Intended use

- Conversational wellness coaching: sleep hygiene, stress management, exercise habits, nutrition guidance, mental wellbeing check-ins.
- On-device deployment in mobile apps where privacy and offline use matter.

## Out of scope

- Pulse is **not** a medical device, diagnostic tool, or substitute for a licensed healthcare professional.
- Do not use Pulse for emergency situations, medication decisions, or diagnosing physical or mental health conditions.
- For any persistent or severe symptoms, consult a qualified clinician.

## Limitations

- As a 3B-parameter model, Pulse has less reasoning depth and factual recall than larger models.
- Quantized variants (Q4_K_M, INT4 Core ML) trade some quality for size and speed.
- Training data is biased toward English and Spanish wellness content; performance in other languages may be weaker.
- Can produce confident but incorrect statements ("hallucinations") — always verify health-related claims.

## License

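Apache 2.0. The base model [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) is published under its own license (listed as the Qwen Research License on its model page at the time of writing); review those terms before redistributing derived weights.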

## Citation

```bibtex
@misc{pulse3b2026,
  title  = {Pulse 3B: A wellness coaching language model},
  author = {Abiral Dahal and Raxtech},
  year   = {2026},
  url    = {https://huggingface.co/Abiral129/Pulse3b}
}
```

## Acknowledgements

Built on top of [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) by the Qwen team at Alibaba. GGUF conversion via [llama.cpp](https://github.com/ggerganov/llama.cpp). Core ML conversion via [coremltools](https://github.com/apple/coremltools).