Metrics are not matching for ARC 1 and ARC 2
I ran the evaluation using https://github.com/SamsungSAILMontreal/TinyRecursiveModels/blob/main/evaluators/arc.py
but the metrics are very low. For ARC 1 it gives:
```
{'all': {'accuracy': np.float32(0.65308), 'exact_accuracy': np.float32(0.021479713), 'lm_loss': np.float32(1.8157994), 'q_halt_accuracy': np.float32(0.8997613), 'q_halt_loss': np.float32(0.28415218), 'steps': np.float32(16.0)}, 'ARC/pass@1': 0.02375, 'ARC/pass@2': 0.02375, 'ARC/pass@5': 0.02375, 'ARC/pass@10': 0.02375, 'ARC/pass@100': 0.02375, 'ARC/pass@1000': 0.02375}
```
and for ARC 2 it gives:
```
{'all': {'accuracy': np.float32(0.6418964), 'exact_accuracy': np.float32(0.0), 'lm_loss': np.float32(1.7541337), 'q_halt_accuracy': np.float32(0.9186047), 'q_halt_loss': np.float32(0.29709587), 'steps': np.float32(16.0)}, 'ARC/pass@1': 0.0, 'ARC/pass@2': 0.0, 'ARC/pass@5': 0.0, 'ARC/pass@10': 0.0, 'ARC/pass@100': 0.0, 'ARC/pass@1000': 0.0}
```
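For context, a high per-cell `accuracy` (~0.64–0.65) combined with a near-zero `exact_accuracy` typically means the model predicts most grid cells correctly but almost never produces a fully correct grid, which is what pass@k counts. A minimal sketch of that distinction, assuming the metrics are computed roughly like this (the hypothetical `cell_and_exact_accuracy` helper is for illustration only, not the repo's actual evaluator code):

```python
import numpy as np

def cell_and_exact_accuracy(preds: np.ndarray, targets: np.ndarray):
    """Contrast per-cell accuracy with exact (whole-grid) accuracy.

    preds, targets: integer arrays of shape (num_puzzles, num_cells),
    one flattened grid per puzzle.
    """
    matches = preds == targets
    cell_acc = matches.mean()              # fraction of correct cells
    exact_acc = matches.all(axis=1).mean() # fraction of fully correct grids
    return float(cell_acc), float(exact_acc)

# Toy example: 2 puzzles, 4 cells each. 7 of 8 cells are correct,
# but only the first grid matches exactly.
preds   = np.array([[1, 2, 3, 4], [1, 2, 3, 9]])
targets = np.array([[1, 2, 3, 4], [1, 2, 3, 4]])
print(cell_and_exact_accuracy(preds, targets))  # (0.875, 0.5)
```

So the numbers above are internally consistent; the question is why exact/pass@k is so low in the first place.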
Please let me know if I am evaluating it incorrectly or if there is some other issue.