Add model-index with benchmark evaluations

#20
by davidlms

Added structured evaluation results (model-index metadata) from the benchmark table in the README:

Automated Benchmarks:

  • MMLU: 55.23
  • GPQA: 31.47
  • IFEval (Instruction following): 74.89
  • IFBench: 20.7
  • GSM8K (Math reasoning): 58.3
  • MGSM (Multilingual math): 55.04
  • MMMLU (Multilingual MMLU): 46.73

Total: 7 benchmarks across reasoning, instruction-following, and multilingual capabilities.

This metadata allows the model to appear on leaderboards and makes it easier to compare with other models.
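For reference, the entries follow the standard Hugging Face model-index format in the README's YAML front matter. A minimal sketch for one benchmark is shown below; the model name and the task, dataset, and metric type identifiers are placeholders, not the exact values used in this PR:

```yaml
model-index:
- name: your-model-name            # placeholder: the repo's actual model name
  results:
  - task:
      type: text-generation        # assumed task type for this model
    dataset:
      name: MMLU
      type: mmlu                   # assumed dataset identifier
    metrics:
    - name: MMLU
      type: accuracy               # assumed metric type
      value: 55.23
```

Each of the seven benchmarks listed above becomes one such result entry under `results`.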

Note: PR #6 (Support tool calls) only modifies the tokenizer configuration, so it should not conflict with this metadata addition.

Cannot merge
This branch has merge conflicts in the following files:
  • README.md
