Original Model: LiquidAI/LFM2-8B-A1B

Base ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.

  1. Pruned to 6B-A1B with REAP
    • Calibration data: allenai/tulu-3-sft-personas-math
    • Various data mixes were tried with the goal of removing multilingual and coding ability while preserving everything else; using math-focused data gave the best-preserved average scores.
  2. Compressed to 3B-A0.6B with LowRankClone
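For intuition, here is a toy sketch of REAP-style expert pruning. It assumes the saliency of each expert is the mean router-gate-weighted norm of that expert's output over calibration tokens; the shapes and the 8 → 6 expert count are illustrative only, not the actual LFM2 configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, n_experts = 512, 64, 8

# Toy MoE layer: each expert is a random linear map; a linear router
# produces softmax gate weights per token.
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)

x = rng.standard_normal((n_tokens, d_model))  # calibration activations
logits = x @ router_w
gates = np.exp(logits - logits.max(axis=1, keepdims=True))
gates /= gates.sum(axis=1, keepdims=True)

# Assumed REAP-style saliency: average gate-weighted output norm.
saliency = np.array([
    (gates[:, e] * np.linalg.norm(x @ experts[e], axis=1)).mean()
    for e in range(n_experts)
])

# Drop the two lowest-saliency experts (8 -> 6, mirroring 8B -> 6B in spirit).
keep = np.argsort(saliency)[2:]
print(sorted(keep.tolist()))
```

The kept experts (and the router columns for them) would then form the smaller MoE; the real method operates per layer on the actual model weights.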
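LowRankClone's full recipe (training the low-rank clone against the teacher) is beyond this card, but its core compression step can be sketched as a truncated-SVD factorization of a weight matrix. The matrix size and rank below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((256, 256))  # stand-in for a dense weight matrix

r = 64  # target rank
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]   # (256, r): left factor, scaled by singular values
B = Vt[:r, :]          # (r, 256): right factor; W ~= A @ B

orig_params = W.size
lr_params = A.size + B.size
print(orig_params, lr_params)  # the factored pair holds half the parameters
```

Replacing each weight matrix with such a factor pair is what shrinks 6B-A1B toward 3B-A0.6B; the factors are then fine-tuned so the clone tracks the teacher.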

5-shot benchmarks (lm-evaluation-harness, loglikelihood test)

  • ARC-C and HellaSwag use acc_norm
| Name | Params | Active | MMLU | GPQA_main | PIQA | ARC-C | HellaSwag |
|---|---|---|---|---|---|---|---|
| LiquidAI/LFM2-8B-A1B | 8.3B | 1.2B | 64.84 | 25.89 | 76.44 | 60.75 | 73.34 |
| LiquidAI/LFM2-2.6B | 2.6B | 2.6B | 64.62 | 35.04 | 77.53 | 56.14 | 72.37 |
| LiquidAI/LFM2-1.2B | 1.2B | 1.2B | 55.12 | 30.36 | 73.50 | 56.06 | 63.46 |
| LiquidAI/LFM2-700M | 0.7B | 0.7B | 49.42 | 31.03 | 72.25 | 51.19 | 58.02 |
| werty1248/LFM2-6B-A1B-REAP | 6.3B | 1.2B | 56.43 | 32.37 | 71.27 | 52.73 | 65.33 |
| werty1248/LRCLFM2MoE-base-checkpoint-160000 | 3.2B | 0.6B | 44.49 | 29.69 | 71.11 | 49.40 | 54.77 |
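A command along these lines should reproduce the 5-shot setup with lm-evaluation-harness (the exact task names, e.g. `gpqa_main_n_shot`, are assumptions that may vary across harness versions):

```shell
lm_eval --model hf \
  --model_args pretrained=werty1248/LFM2-6B-A1B-REAP,dtype=bfloat16 \
  --tasks mmlu,gpqa_main_n_shot,piqa,arc_challenge,hellaswag \
  --num_fewshot 5
```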

๊ตํ›ˆ: REAP๋Š” ๋ชจ๋“  ํ•™์Šต์ด ๋๋‚œ ๋‹ค์Œ์— ํ•˜์ž(distillation์ด ๋‚˜์˜๊ฒŒ ๋จ)

์ด ๋น„์šฉ: 2x B200 ์—์„œ 110์‹œ๊ฐ„ (~$600)
