Instructions for using aixk/haru-180m with libraries and local apps.

Libraries

How to use aixk/haru-180m with llama-cpp-python:

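# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
    repo_id="aixk/haru-180m",
    filename="sai_f16.gguf",
)

# Generate up to 512 new tokens; echo=True includes the prompt in the returned text.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
# The result is an OpenAI-style completion dict, not a plain string.
print(output)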
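
Calling the model returns an OpenAI-style completion dictionary, so printing it shows metadata alongside the text. A minimal sketch of pulling out only the generated text follows. The helper name `completion_text` is hypothetical, and the `sample` dictionary is an illustrative stand-in for the real return value, not actual output from Haru-180M.

```python
def completion_text(output: dict) -> str:
    """Return the text of the first choice in a completion dict, or "" if absent."""
    choices = output.get("choices", [])
    if not choices:
        return ""
    return choices[0].get("text", "")


# Illustrative stand-in for the dict returned by llm(...); not real model output.
sample = {
    "object": "text_completion",
    "choices": [
        {"text": "Once upon a time, ...", "index": 0, "finish_reason": "length"}
    ],
}
print(completion_text(sample))
```

With a loaded model, `print(completion_text(output))` would print the prompt plus its continuation, since `echo=True` keeps the prompt in the text.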

Local Apps

llama.cpp

How to use aixk/haru-180m with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf aixk/haru-180m:F16
# Run inference directly in the terminal:
llama-cli -hf aixk/haru-180m:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf aixk/haru-180m:F16
# Run inference directly in the terminal:
llama-cli -hf aixk/haru-180m:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf aixk/haru-180m:F16
# Run inference directly in the terminal:
./llama-cli -hf aixk/haru-180m:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf aixk/haru-180m:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf aixk/haru-180m:F16
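
Once `llama-server` is running (by any of the install routes above), its OpenAI-compatible API can be called from Python. The sketch below uses only the standard library and assumes the server is listening on its default local port, 8080. The function names `build_completion_request` and `complete` are hypothetical helpers, not part of llama.cpp.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port (8080).
BASE_URL = "http://localhost:8080"


def build_completion_request(
    prompt: str, max_tokens: int = 128, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 128) -> str:
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, max_tokens)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server up, `print(complete("Once upon a time,"))` prints the model's continuation. Building the request separately from sending it makes the payload easy to inspect without a server.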

Use Docker

docker model run hf.co/aixk/haru-180m:F16

vLLM

How to use aixk/haru-180m with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "aixk/haru-180m"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "aixk/haru-180m",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Ollama
How to use aixk/haru-180m with Ollama:
```
ollama run hf.co/aixk/haru-180m:F16
```
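
After the model has been pulled with the command above, it can also be queried through Ollama's local REST API. This is a sketch using only the standard library, assuming the Ollama daemon is listening on its default address (`localhost:11434`). The helper names `build_generate_payload` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: the Ollama daemon is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "hf.co/aixk/haru-180m:F16"


def build_generate_payload(prompt: str, model: str = MODEL) -> dict:
    """Payload for /api/generate; stream=False asks for one JSON reply."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str) -> str:
    """POST the prompt to Ollama and return the generated text (needs Ollama running)."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_generate_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]
```

With Ollama running, `print(generate("Once upon a time,"))` prints the continuation.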

Unsloth Studio

How to use aixk/haru-180m with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for aixk/haru-180m to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for aixk/haru-180m to start chatting

Use Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for aixk/haru-180m to start chatting

Docker Model Runner
How to use aixk/haru-180m with Docker Model Runner:
```
docker model run hf.co/aixk/haru-180m:F16
```

Lemonade

How to use aixk/haru-180m with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull aixk/haru-180m:F16

Run and chat with the model

lemonade run user.haru-180m-F16

List all available models

lemonade list

ISAI - 이사이

I'm an independent developer building and running AI projects on my own.
I personally handle everything: model development, server costs, datasets, and feature updates.
Every contribution, however small, helps keep the service running and funds new features.
If you like the project, please consider supporting it with a donation. Thank you.

ISAI	ollapp	Addly	blogig
logig	AI Magician	99s	Global Stock
AI Archive	wikiwi	wwwiki	Oduck
lai	spirit browser	799	thedeouk
wallpaper forum	webbar	Stode	OMAP
hummorabbit	ollone	ranovel.kr	adsense forum

Model Description

Haru-180M is a lightweight language model built on SmolLM2-135M. It has been tuned to improve Korean language capability and is deeper than its base model, aiming for more robust performance while remaining efficient.

Safetensors

Model size: 0.2B params

Tensor type: F32
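
The tensor type gives a rough idea of how much space the weights need: F32 stores 4 bytes per parameter and F16 stores 2. The sketch below works through that arithmetic. The exact parameter count is not stated here, so the figure of 180 million is an assumption taken from the model name, and the results are estimates for the weights only (no runtime overhead).

```python
# Bytes used to store one parameter in each tensor type.
BYTES_PER_PARAM = {"F32": 4, "F16": 2}


def weight_size_mb(num_params: int, tensor_type: str) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[tensor_type] / 1e6


# Assumption: roughly 180 million parameters, inferred from the name "haru-180m".
approx_params = 180_000_000

for tensor_type in ("F32", "F16"):
    size = weight_size_mb(approx_params, tensor_type)
    print(f"{tensor_type}: ~{size:.0f} MB")
# F32: ~720 MB
# F16: ~360 MB
```

Under that assumption, the F16 GGUF file would be about half the size of the F32 Safetensors weights.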