---
license: apache-2.0
language:
  - en
library_name: transformers
tags:
  - math
  - cycleqd
  - qwen2.5
  - merge
datasets:
  - HLE-Math
base_model:
  - Qwen/Qwen2.5-7B
  - Qwen/Qwen2.5-7B-Instruct
model-index:
  - name: CycleQD-Qwen2.5-7B-Math-Alpha020
    results:
      - task:
          type: mathematical-reasoning
          name: HLE Math
        dataset:
          name: HLE Math
          type: hle-math
        metrics:
          - type: accuracy
            value: 14.29
            name: accuracy
            verified: true
            details: 5 correct out of 35 questions
---

# CycleQD Qwen2.5-7B Math α=0.20

This model is a CycleQD-merged version of Qwen2.5-7B, specifically tuned for mathematical reasoning tasks.

## Model Details

- **Base Models:** Qwen2.5-7B and Qwen2.5-7B-Instruct
- **Merge Method:** Linear interpolation with α = 0.20
- **Formula:** (1 − 0.20) × Qwen2.5-7B + 0.20 × Qwen2.5-7B-Instruct
- **Created:** August 14, 2025
- **Model Size:** 7B parameters (~15 GB)
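The formula above applies per parameter: every weight in the merged checkpoint is the same weighted average of the two source checkpoints. A minimal sketch of that idea, using plain Python lists in place of real weight tensors (`merge_state_dicts` is a hypothetical helper for illustration, not part of any released code):

```python
# Linear-interpolation merge: merged = (1 - alpha) * base + alpha * instruct,
# applied elementwise to every parameter. Plain lists stand in for tensors;
# a real merge would iterate over the two models' state dicts.

def merge_state_dicts(base, instruct, alpha=0.20):
    merged = {}
    for name, base_w in base.items():
        inst_w = instruct[name]
        merged[name] = [(1 - alpha) * b + alpha * i for b, i in zip(base_w, inst_w)]
    return merged

base = {"layer.weight": [1.0, 2.0]}
instruct = {"layer.weight": [3.0, 4.0]}
merged = merge_state_dicts(base, instruct)
print(merged)  # each entry is 0.8 * base + 0.2 * instruct
```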

## Performance

This model shows a marked improvement on the HLE Math evaluation:

- **Accuracy:** 14.29% (5/35 questions correct)
- **Improvement:** 5× over the baseline (1/35 → 5/35)
- **Evaluation:** HLE Math category
- **Judge Model:** Qwen2.5-32B-Instruct

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "You3dimgeo/cycleqd-qwen25-7b-math-alpha020"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Use for mathematical reasoning
prompt = "Solve the equation: 2x^2 + 5x - 3 = 0"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds only the generated continuation; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training/Merge Details

This model was created using the CycleQD (Cycle Quality-Diversity) approach:

1. Start with the Qwen2.5-7B base model.
2. Apply CycleQD optimization for mathematical reasoning.
3. Merge with the instruction-tuned variant using α = 0.20.
4. Evaluate on the HLE Math benchmark.
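CycleQD searches over candidate merges rather than fixing the coefficient by hand. As a loose illustration only (the real procedure is an evolutionary quality-diversity search, not a 1-D grid search), a sweep over α with a hypothetical `evaluate` stub might look like:

```python
# Toy illustration: pick the interpolation coefficient alpha that scores best
# under some evaluation function. `evaluate` is a stand-in stub; in practice it
# would merge the two checkpoints at this alpha and run the HLE Math benchmark.

def evaluate(alpha):
    # Hypothetical score surface that happens to peak at alpha = 0.20.
    return -(alpha - 0.20) ** 2

candidates = [round(0.05 * k, 2) for k in range(0, 11)]  # 0.00, 0.05, ..., 0.50
best_alpha = max(candidates, key=evaluate)
print(best_alpha)  # 0.2
```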

## License

This model is licensed under Apache 2.0, following the original Qwen2.5 license.