Add IFStruct v1.0 evaluation result

#8
by SaylorTwift HF Staff - opened

Add IFStruct v1.0 evaluation result for LiquidAI/LFM2.5-350M

Summary

This PR adds an IFStruct v1.0 evaluation result to the .eval_results/ directory, following the Hugging Face Hub evaluation-results specification. The result was extracted from the Liquid AI IFStruct v1.0 blog.

Benchmark Added

| Benchmark | Score | Hub Dataset | Task ID | Notes |
|---|---|---|---|---|
| IFStruct v1.0 | 44.90 | LiquidAI/ifstruct-v1.0 | ifstruct_v1 | + IFStruct RL |

This is the LFM2.5-350M model after IFStruct RL fine-tuning, as reported in the blog's leaderboard. (The base LFM2.5-350M scores 21.10 on the same benchmark.)
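
To put the two reported scores side by side, the short sketch below computes the absolute and relative gain. It uses only the two numbers quoted above; the variable names are illustrative, and it assumes both scores are on the same scale, which the PR implies but does not state explicitly.

```python
# Scores as reported in this PR description (IFStruct v1.0).
base_score = 21.10  # base LFM2.5-350M
rl_score = 44.90    # LFM2.5-350M after IFStruct RL fine-tuning

# Absolute gain in benchmark points, and gain relative to the base score.
absolute_gain = rl_score - base_score
relative_gain = absolute_gain / base_score

print(f"absolute gain: {absolute_gain:.2f} points")
print(f"relative gain: {relative_gain:.1%}")
```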

Source

Files Added

  • .eval_results/LFM2.5-350M.yaml
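
The contents of the added file are not shown in this PR description. As a rough sketch only, the snippet below renders what a single entry might look like, using the values from the benchmark table. The field names (`dataset.id`, `dataset.task_id`, `value`, `notes`) are assumptions about the Hub evaluation-results format, and `to_yaml_list_item` is a hypothetical helper, not part of any library.

```python
# Hypothetical reconstruction of one entry in .eval_results/LFM2.5-350M.yaml.
# Field names are assumptions; values come from the PR's benchmark table.
entry = {
    "dataset": {"id": "LiquidAI/ifstruct-v1.0", "task_id": "ifstruct_v1"},
    "value": 44.90,
    "notes": "+ IFStruct RL",
}


def to_yaml_list_item(item: dict) -> str:
    """Render a dict with at most one level of nesting as one YAML list item."""
    lines = []
    for key, val in item.items():
        if isinstance(val, dict):
            # Nested mapping: emit the key, then its children indented by two spaces.
            lines.append(f"{key}:")
            lines.extend(f"  {k}: {v}" for k, v in val.items())
        elif isinstance(val, str):
            # Quote top-level strings so a leading "+" is not misread.
            lines.append(f"{key}: {val!r}")
        else:
            lines.append(f"{key}: {val}")
    # Prefix the first line with "- " and indent the rest to align with it.
    return "\n".join(("- " if i == 0 else "  ") + ln for i, ln in enumerate(lines))


rendered = to_yaml_list_item(entry)
print(rendered)
```

The sketch deliberately leaves out any verification token and any source URL, since neither value appears in this PR description.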

Verification

This result was extracted from the official benchmark table published in the IFStruct v1.0 blog. No verification token is provided because the evaluation was not run via HF Jobs with inspect-ai.

mlabonne changed pull request status to merged
