HeuristixAI β€” Dual-Path Disagreement Resolution Model

HAI-DualPath-0.5B is a QLoRA fine-tuned adapter for Qwen2.5-0.5B-Instruct, trained to generate two competing answers, explicitly identify the disagreement between them, and resolve it into a final correct answer.

This is Project 2 of the HeuristixAI research series.
Project 1: HAI-ReflectMini-0.5B

by HeuristixAI · Research Paper


Model Description

Most small language models produce a single answer directly. This model is trained to reason through competing hypotheses before committing: it generates Answer A, then Answer B, identifies what specifically conflicts between them, and resolves to a final answer.

To our knowledge, this structured disagreement-resolution pattern has not previously been demonstrated as a training schema at sub-1B parameter scale.


Training Details

| Parameter | Value |
|---|---|
| Base model | Qwen2.5-0.5B-Instruct |
| Method | QLoRA (4-bit quantization) |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Context length | 768 tokens |
| Peak VRAM | 2.33 GB |
| Training time | ~55 minutes |
| Hardware | NVIDIA GTX 1650 (4 GB) |
| Final train loss | 1.733 |
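A PEFT configuration matching the hyperparameters in the table might look like the sketch below. The `target_modules` list is an assumption (typical attention projections for Qwen2.5) and is not stated in this card.

```python
from peft import LoraConfig

# Sketch of a LoRA config mirroring the table above.
# target_modules is an assumption, not taken from the model card.
lora_config = LoraConfig(
    r=8,                 # LoRA rank
    lora_alpha=16,       # scaling factor
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```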

Dataset

160 structured samples across two domains:

  • Logic / Math (80 samples): arithmetic traps, logical syllogisms, probability puzzles, rate/work/time problems
  • Common Sense (80 samples): causal reasoning, social judgment, science intuition, decision making

Each sample contains five fields: prompt, answer_a, answer_b, disagreement, resolution.
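To illustrate the five-field schema, a hypothetical sample might look like the following. The wording is invented for illustration and is not taken from the actual dataset.

```python
# Hypothetical dataset sample showing the five-field schema.
sample = {
    "prompt": "If a train travels 60 miles in 1.5 hours, what is its speed?",
    "answer_a": "Speed = 60 / 1.5 = 40 mph.",
    "answer_b": "About 60 mph, since it covered roughly 60 miles in about an hour.",
    "disagreement": "Answer A divides by the exact time; Answer B rounds 1.5 hours down to 1 hour.",
    "resolution": "Answer A is correct: speed = distance / time = 60 / 1.5 = 40 mph.",
}

assert set(sample) == {"prompt", "answer_a", "answer_b", "disagreement", "resolution"}
```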


Output Format

Given a prompt, the model responds in this structure:

**Answer A:** [first reasoning path]
**Answer B:** [competing reasoning path]
**Disagreement:** [specific conflict between A and B]
**Resolution:** [final adjudicated answer with justification]
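Downstream code can recover the four sections from a response with a simple regex. This is a minimal sketch, assuming the bold markers above appear exactly once each and in order:

```python
import re

FIELDS = ["Answer A", "Answer B", "Disagreement", "Resolution"]

def parse_dual_path(text):
    """Extract the four sections from a model response, or return None
    if the response does not follow the expected format."""
    pattern = (
        r"\*\*Answer A:\*\*(.*?)"
        r"\*\*Answer B:\*\*(.*?)"
        r"\*\*Disagreement:\*\*(.*?)"
        r"\*\*Resolution:\*\*(.*)"
    )
    m = re.search(pattern, text, flags=re.DOTALL)
    if m is None:
        return None
    return {field: part.strip() for field, part in zip(FIELDS, m.groups())}
```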

Evaluation

Evaluated on 20 held-out prompts not present in training data.

| Metric | Result |
|---|---|
| Dual-path format adherence | 20 / 20 (100%) |
| Disagreement field present | 20 / 20 (100%) |

Ablation finding: a model trained without the Disagreement field achieves lower training loss (1.421 vs 1.488) but produces weaker resolution quality, suggesting that explicit disagreement identification acts as a useful intermediate reasoning scaffold.
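The two format metrics above amount to checking that all four section markers appear, in order, in each held-out response. A sketch of such a checker (not the authors' evaluation script):

```python
def format_adherence(responses):
    """Fraction of responses containing all four section markers in order."""
    markers = ["**Answer A:**", "**Answer B:**", "**Disagreement:**", "**Resolution:**"]

    def follows_format(text):
        pos = -1
        for marker in markers:
            pos = text.find(marker, pos + 1)
            if pos == -1:
                return False
        return True

    return sum(follows_format(r) for r in responses) / len(responses)
```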


Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

model_name = "Qwen/Qwen2.5-0.5B-Instruct"
adapter_path = "heuristixai/HAI-DualPath-0.5B"

# 4-bit NF4 quantization, matching the training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

# Load the QLoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base_model, adapter_path)

prompt = "A bat and a ball cost $1.10 total. The bat costs $1 more than the ball. How much does the ball cost?"
formatted = f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"

inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=400, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Limitations

  • Base model is 0.5B parameters; factual accuracy is limited on complex scientific or mathematical problems
  • The model reliably produces the correct reasoning structure but may arrive at incorrect conclusions on problems requiring deep domain knowledge
  • Trained on 160 samples; a larger dataset would improve factual reliability

Citation

If you use this model in your research, please cite:

@misc{heuristixai2026dualpathqwen,
  title={Dual-Path Disagreement Resolution in Small Language Models},
  author={HeuristixAI},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/heuristixai/HAI-DualPath-0.5B}
}

HeuristixAI Research Series

| Project | Model | Method |
|---|---|---|
| Project 1 | HAI-ReflectMini-0.5B | Self-reflective critique via LoRA |
| Project 2 | HAI-DualPath-0.5B | Dual-path disagreement resolution via QLoRA |