To build IRIS-18B, we first pruned ERNIE-21B by 20% using REAP, then continued pretraining on 3B tokens of thinking traces. We attempted SFT, but the results were poor; we may retry SFT/DPO at a later point, but are releasing the CPT checkpoint as-is for now.
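The pruning step drops the least-useful experts from the MoE layers. A minimal sketch of that selection logic, assuming a per-expert saliency score is already computed (the scoring function here is hypothetical; REAP's actual criterion is router-weighted expert activation):

```python
import numpy as np

def prune_experts(expert_scores: np.ndarray, prune_fraction: float = 0.2) -> np.ndarray:
    """Return the indices of experts to keep, dropping the lowest-scored fraction.

    expert_scores: one saliency value per expert (hypothetical metric here;
    REAP derives it from router-weighted activations).
    """
    n_experts = len(expert_scores)
    n_keep = n_experts - int(n_experts * prune_fraction)
    # argsort is ascending, so the last n_keep indices are the top-scored experts.
    keep = np.argsort(expert_scores)[-n_keep:]
    # Preserve the original expert ordering for the pruned layer.
    return np.sort(keep)

# Toy example: 10 experts, prune 20% -> the 2 lowest-scored are dropped.
scores = np.array([0.9, 0.1, 0.5, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.95])
kept = prune_experts(scores, 0.2)
```

With these toy scores, experts 1 and 4 (scores 0.1 and 0.2) are removed and the remaining 8 are kept in order.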
The following improvements over ERNIE-21B-REAP were observed:
| Benchmark | Pre-CPT | Post-CPT | Δ |
|---|---|---|---|
| ARC-Easy | 79.6 | 83.9 | +4.3 |
| ARC-Challenge | 50.6 | 60.4 | +9.8 |
| HellaSwag | 70.5 | 78.9 | +8.4 |
| Winogrande | 67.2 | 72.1 | +4.9 |
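The Δ column is simply post-CPT minus pre-CPT accuracy; a quick arithmetic check of the table above:

```python
# Scores copied from the benchmark table above.
pre  = {"ARC-Easy": 79.6, "ARC-Challenge": 50.6, "HellaSwag": 70.5, "Winogrande": 67.2}
post = {"ARC-Easy": 83.9, "ARC-Challenge": 60.4, "HellaSwag": 78.9, "Winogrande": 72.1}

# Recompute each delta (rounded to one decimal to avoid float noise).
deltas = {k: round(post[k] - pre[k], 1) for k in pre}
```

The recomputed deltas match the table: +4.3, +9.8, +8.4, +4.9.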
Available quantizations: 2-bit, 8-bit, 16-bit.
Model tree for jerrimu/IRIS-18B-GGUFS:
- Base model: baidu/ERNIE-4.5-21B-A3B-Base-PT
- Finetuned: jerrimu/ERNIE-21B-REAP
- Finetuned: jerrimu/IRIS-18B-CPT