metadata
license: apache-2.0
base_model: Qwen/Qwen2.5-3B
language:
  - en
  - es
library_name: transformers
pipeline_tag: text-generation
tags:
  - wellness
  - health-coaching
  - sleep
  - fitness
  - mental-health
  - qwen2
  - gguf
  - coreml
  - on-device

Pulse 3B

Pulse is a personal wellness AI coach fine-tuned from Qwen2.5-3B. It is designed to help users with sleep, stress, fitness, nutrition, and mental wellbeing in a warm, motivating, science-backed tone.

Pulse is built into the Pulse app by Raxtech, and was created by Abiral Dahal (Head of Mobile & AI, Raxtech — Bilbao, Spain).

Highlights

  • 3.1B parameters, Qwen2 architecture, 32K context.
  • Ships in three formats so you can run it anywhere:
    • final/ — BF16 safetensors for HuggingFace transformers.
    • gguf/pulse-q4_k_m.gguf — 4-bit quantized GGUF for llama.cpp / Ollama / LM Studio (~1.8 GB, runs on CPU).
    • coreml/pulse.mlpackage — INT4 Core ML package for on-device inference on Apple Silicon (iOS / macOS).

Quick start

Ollama (easiest)

# Download the GGUF
huggingface-cli download Abiral129/Pulse3b gguf/pulse-q4_k_m.gguf --local-dir .

# Minimal Modelfile
cat > Modelfile <<'EOF'
FROM ./gguf/pulse-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 2048
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
EOF

ollama create pulse -f Modelfile
ollama run pulse "I've been sleeping 5 hours for a week, what do I do?"
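Once the model has been created, it can also be queried from code. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and the `pulse` model name from the command above; the helper names `build_chat_request` and `chat` are illustrative and not part of this repository.

```python
import json
import urllib.request


def build_chat_request(user_text, system_text=None, model="pulse",
                       host="http://localhost:11434"):
    """Build a POST request for Ollama's /api/chat endpoint (hypothetical helper)."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    # stream=False asks the server for one JSON object instead of a chunked stream.
    payload = {"model": model, "messages": messages, "stream": False}
    return urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def chat(user_text, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text, **kwargs)) as resp:
        return json.loads(resp.read())["message"]["content"]


# Example (requires a running Ollama server):
# print(chat("I've been sleeping 5 hours for a week, what do I do?"))
```

The sampling parameters set in the Modelfile apply automatically, so the request does not need to repeat them.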

Transformers (BF16)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
model = AutoModelForCausalLM.from_pretrained(
    "Abiral129/Pulse3b",
    subfolder="final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Pulse, a personal wellness coach."},
    {"role": "user", "content": "My resting heart rate jumped from 62 to 88. What's going on?"},
]
ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# do_sample=True is required; without it, temperature and top_p are ignored.
out = model.generate(ids, max_new_tokens=300, do_sample=True, temperature=0.7, top_p=0.9)
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))

llama.cpp

./llama-cli -m gguf/pulse-q4_k_m.gguf \
  -p "You are Pulse, a wellness coach." \
  -cnv --temp 0.7 --top-p 0.9 --repeat-penalty 1.1 -c 2048

Core ML (Apple Silicon)

import coremltools as ct
from transformers import AutoTokenizer
import numpy as np

tok = AutoTokenizer.from_pretrained("Abiral129/Pulse3b", subfolder="final")
mlmodel = ct.models.MLModel("coreml/pulse.mlpackage")
ids = tok("Hello Pulse", return_tensors="np").input_ids.astype(np.int32)
print(mlmodel.predict({"input_ids": ids}))

For full token-by-token generation on iOS / macOS, integrate the .mlpackage into your app and implement a generation loop that applies greedy or sampling-based decoding to the logits returned at each step.
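Such a loop can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Pulse app's implementation: `next_logits` is a hypothetical callable that wraps the Core ML model and returns the next-token logits as a list of floats, and `eos_id` is whatever end-of-turn token id the tokenizer reports. The sketch applies temperature and top-p only; it omits the repeat penalty and any key-value caching.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Pick one token id from a sequence of next-token logits."""
    if temperature <= 0:
        # Greedy decoding: take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax (subtract the max for numerical stability).
    peak = max(logits)
    weights = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Nucleus (top-p) filter: keep the most likely tokens until their
    # cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # random.choices renormalises the weights of the kept tokens.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def generate(next_logits, prompt_ids, eos_id, max_new_tokens=300, **sampling):
    """Append sampled tokens until eos_id appears or the budget runs out.

    next_logits is a hypothetical callable: list of token ids -> logits list.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        token = sample_next_token(next_logits(ids), **sampling)
        if token == eos_id:
            break
        ids.append(token)
    return ids[len(prompt_ids):]
```

Passing `temperature=0` switches the same loop to greedy decoding, which is useful for reproducible debugging on device.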

Recommended system prompt

You are Pulse, a personal wellness AI coach. You are warm, motivating, empathetic, and science-backed. You help users with sleep, stress, fitness, nutrition, and mental wellbeing. Never say "As an AI" — you are Pulse, a wellness coach. Be concise, practical, and encouraging.
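When a runtime does not apply a chat template for you (for example, a hand-written Core ML loop), the system prompt has to be placed into the ChatML layout that the Ollama template above uses. A minimal sketch of that formatting follows; `build_chatml_prompt` is an illustrative helper name, and the short system text stands in for the full recommended prompt.

```python
def build_chatml_prompt(messages):
    """Render role/content messages in ChatML and open the assistant turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Pulse, a personal wellness coach."},
    {"role": "user", "content": "How can I wind down before bed?"},
])
print(prompt)
```

With Transformers, `tok.apply_chat_template` performs this step for you, so the helper is only needed for runtimes that take a raw string.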

Sampling defaults

Param Value
temperature 0.7
top_p 0.9
repeat_penalty 1.1
num_ctx 2048
stop `<|im_end|>`, `<|im_start|>`
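The parameter names above follow Ollama and llama.cpp conventions. As a rough sketch, they could be mapped onto Hugging Face `generate()` keyword arguments as shown below; `to_transformers_kwargs` is an illustrative helper, and the `max_new_tokens` value is an assumption taken from the Transformers example rather than a tuned default. `num_ctx` and the stop strings have no direct counterpart in this mapping and are left out.

```python
# Sampling defaults from the table, using Ollama-style names.
PULSE_DEFAULTS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "num_ctx": 2048,
    "stop": ["<|im_end|>", "<|im_start|>"],
}


def to_transformers_kwargs(params, max_new_tokens=300):
    """Translate Ollama-style sampling params to generate() keyword arguments."""
    return {
        # Sampling must be enabled, or temperature and top_p are ignored.
        "do_sample": True,
        "temperature": params["temperature"],
        "top_p": params["top_p"],
        # Transformers calls this setting repetition_penalty.
        "repetition_penalty": params["repeat_penalty"],
        "max_new_tokens": max_new_tokens,
    }


gen_kwargs = to_transformers_kwargs(PULSE_DEFAULTS)
# Usage with a loaded model: model.generate(ids, **gen_kwargs)
```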

Intended use

  • Conversational wellness coaching: sleep hygiene, stress management, exercise habits, nutrition guidance, mental wellbeing check-ins.
  • On-device deployment in mobile apps where privacy and offline use matter.

Out of scope

  • Pulse is not a medical device, diagnostic tool, or substitute for a licensed healthcare professional.
  • Do not use Pulse for emergency situations, medication decisions, or diagnosing physical or mental health conditions.
  • For any persistent or severe symptoms, consult a qualified clinician.

Limitations

  • 3B-parameter model — reasoning depth and factual recall are limited compared to larger models.
  • Quantized variants (Q4_K_M, INT4 Core ML) trade some quality for size and speed.
  • Training data is biased toward English and Spanish wellness content; performance in other languages may be weaker.
  • Can produce confident but incorrect statements ("hallucinations") — always verify health-related claims.

License

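Released under Apache 2.0, as declared in this model card's metadata. Note that the base model, Qwen/Qwen2.5-3B, is distributed by the Qwen team under the Qwen Research License rather than Apache 2.0, so review the base model's terms before redistribution or commercial use.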

Citation

@misc{pulse3b2026,
  title  = {Pulse 3B: A wellness coaching language model},
  author = {Abiral Dahal and Raxtech},
  year   = {2026},
  url    = {https://huggingface.co/Abiral129/Pulse3b}
}

Acknowledgements

Built on top of Qwen2.5-3B by the Qwen team at Alibaba. GGUF conversion via llama.cpp. Core ML conversion via coremltools.