---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B
language:
- en
- es
library_name: transformers
pipeline_tag: text-generation
tags:
- wellness
- health-coaching
- sleep
- fitness
- mental-health
- qwen2
- gguf
- coreml
- on-device
---

# Pulse 3B

Pulse is a personal wellness AI coach fine-tuned from **Qwen2.5-3B**. It is designed to help users with sleep, stress, fitness, nutrition, and mental wellbeing in a warm, motivating, science-backed tone.

Pulse is built into the [Pulse app](https://raxtech.io) by Raxtech, and was created by **Abiral Dahal** (Head of Mobile & AI, Raxtech, Bilbao, Spain).

## Highlights

- **3.1B parameters**, Qwen2 architecture, 32K context.
- Ships in three formats so you can run it anywhere:
  - `final/`: BF16 `safetensors` for HuggingFace `transformers`.
  - `gguf/pulse-q4_k_m.gguf`: 4-bit quantized GGUF for `llama.cpp` / Ollama / LM Studio (~1.8 GB, runs on CPU).
  - `coreml/pulse.mlpackage`: INT4 Core ML package for on-device inference on Apple Silicon (iOS / macOS).

## Quick start

### Ollama (easiest)

```bash
# Download the GGUF
huggingface-cli download Abiral129/Pulse3b gguf/pulse-q4_k_m.gguf --local-dir .

# Minimal Modelfile
cat > Modelfile <<'EOF'
FROM ./gguf/pulse-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 2048
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
EOF

ollama create pulse -f Modelfile
ollama run pulse "I've been sleeping 5 hours for a week, what do I do?"
```

### Transformers (BF16)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Tokenizer and weights both live in the `final/` subfolder of the repo.
tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
model = AutoModelForCausalLM.from_pretrained(
    "Abiral129/Pulse3b",
    subfolder="final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Pulse, a personal wellness coach."},
    {"role": "user", "content": "My resting heart rate jumped from 62 to 88. What's going on?"},
]

# Render the chat template and append the assistant turn header.
ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# do_sample=True is required; without it generate() decodes greedily
# and ignores temperature / top_p.
out = model.generate(ids, max_new_tokens=300, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
```

### llama.cpp

```bash
./llama-cli -m gguf/pulse-q4_k_m.gguf \
  -p "You are Pulse, a wellness coach." \
  -cnv --temp 0.7 --top-p 0.9 --repeat-penalty 1.1 -c 2048
```

### Core ML (Apple Silicon)

```python
import coremltools as ct
from transformers import AutoTokenizer
import numpy as np

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
mlmodel = ct.models.MLModel("coreml/pulse.mlpackage")

# Core ML expects int32 token ids; the input name must match the exported package.
ids = tok("Hello Pulse", return_tensors="np").input_ids.astype(np.int32)

# Single forward pass: prints the raw model outputs, not generated text.
print(mlmodel.predict({"input_ids": ids}))
```

The snippet above runs a single forward pass. For full token-by-token generation on iOS / macOS, integrate the `.mlpackage` into your app and implement a generation loop (greedy or sampling) on top of the logits.

## Recommended system prompt

```
You are Pulse, a personal wellness AI coach.
You are warm, motivating, empathetic, and science-backed.
You help users with sleep, stress, fitness, nutrition, and mental wellbeing.
Never say "As an AI" — you are Pulse, a wellness coach.
Be concise, practical, and encouraging.
```

## Sampling defaults

| Param | Value |
|---|---|
| `temperature` | 0.7 |
| `top_p` | 0.9 |
| `repeat_penalty` | 1.1 |
| `num_ctx` | 2048 |
| stop | `<\|im_end\|>`, `<\|im_start\|>` |

## Intended use

- Conversational wellness coaching: sleep hygiene, stress management, exercise habits, nutrition guidance, mental wellbeing check-ins.
- On-device deployment in mobile apps where privacy and offline use matter.

## Out of scope

- Pulse is **not** a medical device, diagnostic tool, or substitute for a licensed healthcare professional.
- Do not use Pulse for emergency situations, medication decisions, or diagnosing physical or mental health conditions.
- For any persistent or severe symptoms, consult a qualified clinician.

## Limitations

- 3B-parameter model: reasoning depth and factual recall are limited compared to larger models.
- Quantized variants (Q4_K_M, INT4 Core ML) trade some quality for size and speed.
- Training data is biased toward English and Spanish wellness content; performance in other languages may be weaker.
- Can produce confident but incorrect statements ("hallucinations"), so always verify health-related claims.

## License

Apache 2.0, inherited from the base model [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B).

## Citation

```bibtex
@misc{pulse3b2026,
  title  = {Pulse 3B: A wellness coaching language model},
  author = {Abiral Dahal and Raxtech},
  year   = {2026},
  url    = {https://huggingface.co/Abiral129/Pulse3b}
}
```

## Acknowledgements

Built on top of [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) by the Qwen team at Alibaba. GGUF conversion via [llama.cpp](https://github.com/ggerganov/llama.cpp). Core ML conversion via [coremltools](https://github.com/apple/coremltools).
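
## Appendix: sampling loop sketch

The Core ML section leaves the token-by-token generation loop to the integrator. The block below is a minimal, dependency-free sketch of what such a loop could look like with the sampling defaults from this card (temperature 0.7, top-p 0.9, repeat penalty 1.1). It is not the code used in the Pulse app. `sample_next_token`, `generate`, and the `next_logits` callable are hypothetical names introduced for illustration, and the divide/multiply form of the repetition penalty is an assumption modelled on the llama.cpp-style penalty. In practice `next_logits` would wrap a model call and return the logits for the last position; the output names of the `.mlpackage` are not documented here, so that wrapper is left out.

```python
import math
import random


def sample_next_token(logits, prev_ids=(), temperature=0.7, top_p=0.9,
                      repeat_penalty=1.1, rng=random):
    """Pick one token id from a flat list of logits (one per vocabulary entry)."""
    logits = list(logits)

    # Repetition penalty (assumed llama.cpp-style): make seen tokens less likely.
    for t in set(prev_ids):
        logits[t] = logits[t] / repeat_penalty if logits[t] > 0 else logits[t] * repeat_penalty

    # A temperature of 0 (or below) is treated as greedy decoding.
    if temperature <= 0:
        return max(range(len(logits)), key=logits.__getitem__)

    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p (nucleus): keep the most likely tokens until their mass reaches top_p.
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, mass = [], 0.0
    for t in order:
        keep.append(t)
        mass += probs[t]
        if mass >= top_p:
            break

    # random.choices treats the weights as relative, so no renormalisation is needed.
    return rng.choices(keep, weights=[probs[t] for t in keep], k=1)[0]


def generate(next_logits, prompt_ids, stop_ids, max_new_tokens=300, **sampling):
    """Token-by-token loop; `next_logits(ids)` returns logits for the next position."""
    ids, new_ids = list(prompt_ids), []
    for _ in range(max_new_tokens):
        token = sample_next_token(next_logits(ids), prev_ids=ids, **sampling)
        if token in stop_ids:
            break
        ids.append(token)
        new_ids.append(token)
    return new_ids
```

`stop_ids` would hold the token ids of `<|im_end|>` and `<|im_start|>`, looked up through the tokenizer. This sketch penalises every token seen so far, prompt included; runtimes such as llama.cpp typically restrict the penalty to a window of recent tokens, so results will not match them exactly.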