MiniCPM5-1B Eval Benchmark: Generate and evaluate MiniCPM5-1B model responses