FR-Ponder Evaluation Results
==============================

Dataset: gsm8k
--------------------
FR-Ponder:
  Accuracy: 0.000
  Avg FLOPs: 1368148992
  Avg Time: 3.324s
  Avg Steps: 104.3
Baseline (α=0.4):
  Accuracy: 0.250
  Avg FLOPs: 1259616826016
  Avg Time: 5.597s
Improvements:
  Accuracy: -0.250
  FLOPs reduction: 99.9%
  Speedup: 1.68x
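
The Improvements figures can be recomputed from the raw averages reported above. The sketch below is one way to do that, not the evaluation script itself. It assumes the accuracy change is FR-Ponder minus baseline, the FLOPs reduction is one minus the FLOPs ratio, and the speedup is baseline time divided by FR-Ponder time. The function name `summarize_improvements` and the dictionary keys are hypothetical and chosen only for illustration.

```python
def summarize_improvements(fr, base):
    """Derive the 'Improvements' lines from two sets of averaged metrics.

    Assumed definitions (not confirmed by the report):
      accuracy change = fr accuracy - baseline accuracy
      FLOPs reduction = 1 - fr FLOPs / baseline FLOPs
      speedup         = baseline time / fr time
    """
    acc_delta = fr["accuracy"] - base["accuracy"]
    flops_reduction = 1.0 - fr["avg_flops"] / base["avg_flops"]
    speedup = base["avg_time_s"] / fr["avg_time_s"]
    return {
        # The '+' flag prints exactly one sign, for positive and negative values.
        "accuracy": f"{acc_delta:+.3f}",
        "flops_reduction": f"{flops_reduction:.1%}",
        "speedup": f"{speedup:.2f}x",
    }


# Values copied from the gsm8k results in the report.
fr_ponder = {"accuracy": 0.000, "avg_flops": 1368148992, "avg_time_s": 3.324}
baseline = {"accuracy": 0.250, "avg_flops": 1259616826016, "avg_time_s": 5.597}

summary = summarize_improvements(fr_ponder, baseline)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions the output is -0.250 for accuracy, 99.9% for FLOPs reduction, and 1.68x for speedup. The FLOPs and time gains therefore come with a drop in accuracy from 0.250 to 0.000 on gsm8k.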