SBCC Message Generator - Qwen2.5-7B
Fine-tuned Qwen2.5-7B model for generating Social and Behavior Change Communication (SBCC) messages for rural health and agriculture practices.
Model Description
This model generates contextually appropriate, culturally sensitive health and agriculture messages that:
- Address specific household barriers (economic, knowledge, space, time, social, physical, infrastructure, environmental)
- Provide practical solutions within existing constraints
- Use natural, conversational language accessible to non-native speakers
- Follow evidence-based SBCC principles (Accurate, Clear, Respectful, Encouraging)
- Stay within the target message length of 260-343 characters
- Base Model: unsloth/Qwen2.5-7B-Instruct-bnb-4bit
- Fine-tuning Method: QLoRA with 4-bit quantization
- Training Framework: Unsloth + TRL
Training Configuration
- LoRA Rank: 64
- LoRA Alpha: 16
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Learning Rate: 2e-4
- Epochs: 5
- Batch Size: 4 (gradient accumulation: 4)
- Max Sequence Length: 2048
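Two quantities follow directly from this configuration: with alpha 16 and rank 64, the LoRA update is scaled by 16/64 = 0.25, and gradient accumulation gives an effective batch size of 4 × 4 = 16. A quick check:

```python
lora_rank = 64
lora_alpha = 16
batch_size = 4
grad_accum = 4

# The LoRA delta is scaled by alpha / rank before being added to the frozen weights
scaling = lora_alpha / lora_rank

# Gradient accumulation multiplies the per-step batch into an effective batch
effective_batch = batch_size * grad_accum

print(scaling, effective_batch)  # → 0.25 16
```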
Usage
Installation
```bash
pip install transformers torch unsloth
```
Inference Code
```python
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "wcosmas/sbcc-qwen"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# System prompt (critical for quality output)
SYSTEM_PROMPT = '''You write natural, helpful SBCC messages for rural households.
CRITICAL RULES:
1. NEVER use formulaic openings like "It's easy to forget..." or "It's normal to feel..."
2. Each opening must be UNIQUE and match the specific barrier emotion
3. Vary your language - don't repeat phrases across messages
4. Natural conversational tone (like talking to a neighbor, not a script)
5. Address the ACTUAL barrier constraint with alternatives that work WITHIN limitations
SBCC MESSAGE QUALITY METRICS:
- ACCURATE: Information is factually correct, evidence-based
- CLEAR: Simple language, specific implementable steps
- RESPECTFUL: Maintains dignity, respects local practices
- ENCOURAGING: Motivates positive behavior change
CRITICAL REQUIREMENTS:
1. DIRECTLY address barrier with alternatives that work WITHIN limitations
2. NATURAL MESSAGE STRUCTURE (4 parts, 260-343 chars):
- Acknowledge SPECIFIC barrier naturally
- Practical solution WITHIN the limitation
- Concrete details (measurements, materials, steps)
- Tangible benefit
3. LENGTH: 260-343 characters (STRICT)
4. FORMAT: Plain text, NO markdown, NO greetings
Return valid JSON: {"practice": "...", "barrier": "...", "message_type": "...", "sbcc_message": "...", "additional_instructions": "...", "deficiencies": [...]}'''


def generate_sbcc_message(practice, message_type="adoption", barrier=None, deficiencies=None):
    # Prepare input
    input_data = {"practice": practice, "message_type": message_type}
    if barrier:
        input_data["barrier_text"] = barrier
    if deficiencies:
        input_data["deficiencies"] = deficiencies

    # Build the ChatML prompt expected by Qwen models
    user_msg = f"Generate an SBCC message for: {json.dumps(input_data, ensure_ascii=False)}"
    prompt = (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

    # Tokenize
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=300,
            temperature=0.3,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens
    response = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    ).strip()

    # Parse JSON; fall back to wrapping the raw text if parsing fails
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        return {
            "practice": practice,
            "barrier": barrier,
            "message_type": message_type,
            "sbcc_message": response,
            "deficiencies": deficiencies or [],
        }


# Example usage
result = generate_sbcc_message(
    practice="Handwashing",
    message_type="adoption",
    barrier="Soap is too expensive for daily use",
)
print(f"Message: {result['sbcc_message']}")
print(f"Length: {len(result['sbcc_message'])} characters")
```
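The ChatML prompt assembled above can be factored into a small helper. This is a sketch (`build_chatml_prompt` is not part of the released code); for Qwen models, `tokenizer.apply_chat_template` should produce an equivalent layout:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a Qwen/ChatML prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You write SBCC messages.", "Generate an SBCC message.")
print(prompt.startswith("<|im_start|>system"))  # → True
```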
Example Outputs
Input:
```json
{
  "practice": "Handwashing",
  "message_type": "adoption",
  "barrier_text": "Soap is too expensive for daily use"
}
```
Output:
"I understand soap can be costly. You can make simple handwashing solution with ash and water—mix 1 cup wood ash with 2 cups water, strain it. Use this solution to wash hands after toilet and before cooking. This costs nothing and still removes germs effectively to protect your family's health."
Input:
```json
{
  "practice": "Pit Composting",
  "message_type": "adoption",
  "barrier_text": "Land size is too small for proper composting"
}
```
Output:
"I hear your land is small. You can still compost in a small space—build a simple heap 3 ft × 3 ft × 3 ft against your fence or house wall. Layer kitchen scraps, dry leaves, and a bit of soil. Turn it every week and keep moist. In 6-8 weeks you'll have rich compost to improve your vegetables."
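Because the model enforces a strict 260-343 character range and a fixed JSON schema, a lightweight post-generation check is worth running before using an output. This validator is a sketch, not part of the released code:

```python
def validate_sbcc_output(result: dict) -> list:
    """Return a list of problems with a generated SBCC result; empty means it passed."""
    problems = []
    msg = result.get("sbcc_message", "")
    # The system prompt demands 260-343 characters (STRICT)
    if not (260 <= len(msg) <= 343):
        problems.append(f"length {len(msg)} outside 260-343")
    # Core fields from the JSON schema in the system prompt
    for field in ("practice", "message_type", "sbcc_message"):
        if field not in result:
            problems.append(f"missing field: {field}")
    return problems

ok = {"practice": "Handwashing", "message_type": "adoption", "sbcc_message": "x" * 300}
bad = {"practice": "Handwashing", "message_type": "adoption", "sbcc_message": "too short"}
print(validate_sbcc_output(ok))   # → []
```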
GGUF Format (Ollama/LM Studio)
A quantized GGUF version (Q4_K_M) is available for local deployment:
Download the GGUF file from this repository and load it with Ollama or LM Studio.
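For Ollama, a minimal Modelfile might look like the following. The GGUF filename and parameter choice here are illustrative assumptions, not part of this repository; adjust them to the actual downloaded file:

```
FROM ./sbcc-qwen-Q4_K_M.gguf
PARAMETER temperature 0.3
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
```

Then build and run with `ollama create sbcc-qwen -f Modelfile` followed by `ollama run sbcc-qwen`.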
Message Types
- Adoption Messages: Encourage starting a new practice, addressing specific barriers
- Improvement Messages: Guide enhancement of existing practices based on observed deficiencies
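For an improvement message, the input carries a `deficiencies` list instead of `barrier_text`. A sketch of the payload the inference code above would serialize (the deficiency strings are illustrative):

```python
import json

input_data = {
    "practice": "Pit Composting",
    "message_type": "improvement",
    "deficiencies": ["heap not turned regularly", "compost too dry"],
}

# Mirrors the user message built inside generate_sbcc_message()
user_msg = f"Generate an SBCC message for: {json.dumps(input_data, ensure_ascii=False)}"
print(user_msg.startswith("Generate an SBCC message for:"))  # → True
```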
Supported Barrier Categories
- Economic and Financial
- Knowledge and Skills
- Land and Space
- Time and Labor
- Social and Cultural
- Physical and Health
- Infrastructure and Access
- Environmental and Climate
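One way to exercise all eight categories is to keep an example barrier per category and loop over them. The first two barrier texts come from the examples above; the rest are illustrative assumptions:

```python
example_barriers = {
    "Economic and Financial": "Soap is too expensive for daily use",
    "Land and Space": "Land size is too small for proper composting",
    "Knowledge and Skills": "Unsure how to layer a compost heap",
    "Time and Labor": "No time to fetch extra water for handwashing",
    "Social and Cultural": "Neighbors discourage trying new practices",
    "Physical and Health": "Back pain makes digging a pit difficult",
    "Infrastructure and Access": "No nearby market sells handwashing supplies",
    "Environmental and Climate": "Dry season leaves little water to spare",
}

# Each value could be passed as the `barrier` argument to generate_sbcc_message()
for category, barrier in example_barriers.items():
    print(f"{category}: {barrier}")
```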
Limitations
- Optimized for rural health/agriculture contexts in East Africa
- Generates messages in English (260-343 character range)
- Requires careful prompt engineering for best results
- May need human review for cultural appropriateness in specific contexts
Citation
If you use this model, please cite:
```bibtex
@misc{sbcc-qwen-2025,
  author    = {Wamozo Cosmas},
  title     = {SBCC Message Generator - Qwen2.5-7B},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/wcosmas/sbcc-qwen}
}
```
License
Apache 2.0 (inherited from base model)
Model tree for wcosmas/sbcc-qwen
- Base model: Qwen/Qwen2.5-7B
- Fine-tuned: Qwen/Qwen2.5-7B-Instruct
- Quantized: unsloth/Qwen2.5-7B-Instruct-bnb-4bit