AI-MO/NuminaMath-CoT
Viewer • Updated • 860k • 62.5k • 578
How to use jerrimu/IRIS-18B-GGUFS with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="jerrimu/IRIS-18B-GGUFS", filename="IRIS-BF16.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
How to use jerrimu/IRIS-18B-GGUFS with llama.cpp:
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf jerrimu/IRIS-18B-GGUFS:BF16 # Run inference directly in the terminal: llama-cli -hf jerrimu/IRIS-18B-GGUFS:BF16
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf jerrimu/IRIS-18B-GGUFS:BF16 # Run inference directly in the terminal: llama-cli -hf jerrimu/IRIS-18B-GGUFS:BF16
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf jerrimu/IRIS-18B-GGUFS:BF16 # Run inference directly in the terminal: ./llama-cli -hf jerrimu/IRIS-18B-GGUFS:BF16
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf jerrimu/IRIS-18B-GGUFS:BF16 # Run inference directly in the terminal: ./build/bin/llama-cli -hf jerrimu/IRIS-18B-GGUFS:BF16
docker model run hf.co/jerrimu/IRIS-18B-GGUFS:BF16
How to use jerrimu/IRIS-18B-GGUFS with Ollama:
ollama run hf.co/jerrimu/IRIS-18B-GGUFS:BF16
How to use jerrimu/IRIS-18B-GGUFS with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for jerrimu/IRIS-18B-GGUFS to start chatting
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for jerrimu/IRIS-18B-GGUFS to start chatting
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for jerrimu/IRIS-18B-GGUFS to start chatting
How to use jerrimu/IRIS-18B-GGUFS with Docker Model Runner:
docker model run hf.co/jerrimu/IRIS-18B-GGUFS:BF16
How to use jerrimu/IRIS-18B-GGUFS with Lemonade:
# Download Lemonade from https://lemonade-server.ai/ lemonade pull jerrimu/IRIS-18B-GGUFS:BF16
lemonade run user.IRIS-18B-GGUFS-BF16
lemonade list
To build IRIS 18B first we reap pruned ERNIE 21B by 20%, then trained on 3B of thinking traces. We attempted SFT but it was not pretty, may retry SFT/DPO at a later point but releasing like this for now.
These improvements over ERNIE-21B-REAP have been noted
Benchmark Pre-CPT Post-CPT Δ
ARC-Easy 79.6 83.9 +4.3
ARC-Challenge 50.6 60.4 +9.8
HellaSwag 70.5 78.9 +8.4
Winogrande 67.2 72.1 +4.9
2-bit
8-bit
16-bit
Base model
baidu/ERNIE-4.5-21B-A3B-Base-PT