GGUF
conversational
How to use from the
Use from the
llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="jerrimu/IRIS-18B-GGUFS",
	filename="",
)
llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

To build IRIS 18B first we reap pruned ERNIE 21B by 20%, then trained on 3B of thinking traces. We attempted SFT but it was not pretty, may retry SFT/DPO at a later point but releasing like this for now.

These improvements over ERNIE-21B-REAP have been noted

Benchmark Pre-CPT Post-CPT Δ

ARC-Easy 79.6 83.9 +4.3

ARC-Challenge 50.6 60.4 +9.8

HellaSwag 70.5 78.9 +8.4

Winogrande 67.2 72.1 +4.9

Downloads last month
66
GGUF
Model size
18B params
Architecture
ernie4_5-moe
Hardware compatibility
Log In to add your hardware

2-bit

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for jerrimu/IRIS-18B-GGUFS

Quantized
(1)
this model

Datasets used to train jerrimu/IRIS-18B-GGUFS