
	# Dataset Guide

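	This document describes the dataset bundle that ships inside the `gemma4-e2b-mud` repository.
	The goal is to go beyond listing which files exist: it explains what each file is for, how it was cleaned, and what rules to follow when building similar data, all in one place.

	Because this is a model repository, the data here is less a standalone dataset repo than a companion dataset bundle for reproduction and extension experiments.
	In other words, it aims to give a practical starting point to anyone who wants to understand the model, retrain it, or adapt it.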

	## At a glance

	The bundle contains the following four main files.

	- `gemma-mud-colab-starter/dataset/combined_1000.jsonl`
	- `gemma-mud-colab-starter/dataset/combined_1000.unsloth_chatml_dedup.jsonl`
	- `gemma-mud-colab-starter/dataset/combined_1000.unsloth_gemma4_messages_dedup.jsonl`
	- `gemma-mud-colab-starter/dataset/gemma4_mud_alpaca_100.jsonl`

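	In summary, the roles are:

	- `combined_1000.jsonl`: 1000 rows of message data, close to the original
	- `combined_1000.unsloth_chatml_dedup.jsonl`: a 963-row training set, cleaned into ChatML style after exact-duplicate removal
	- `combined_1000.unsloth_gemma4_messages_dedup.jsonl`: the same content as above, as a typed text messages version that is safer to use with Gemma 4 / Unsloth
	- `gemma4_mud_alpaca_100.jsonl`: the smallest file, a 100-row Alpaca-style dataset meant as an example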

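	## Actual row counts

	Based on the files currently in the repository:

	- `combined_1000.jsonl`: `1000` rows
	- `combined_1000.unsloth_chatml_dedup.jsonl`: `963` rows
	- `combined_1000.unsloth_gemma4_messages_dedup.jsonl`: `963` rows
	- `gemma4_mud_alpaca_100.jsonl`: `100` rows

	That is, the cleaned version, produced by removing 37 exact-duplicate rows from the original `1000`, is the current reference set for training.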
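The row counts above can be re-checked with a few lines of Python. The following is a minimal sketch rather than the repository's own tooling: `count_jsonl_rows` is a hypothetical helper name, and it assumes each non-empty line holds exactly one JSON object.

```python
import json


def count_jsonl_rows(lines):
    """Count non-empty JSONL rows, failing loudly on a malformed row."""
    rows = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines such as a trailing newline
        json.loads(line)  # raises json.JSONDecodeError if the row is not valid JSON
        rows += 1
    return rows


# In-memory demo; with a real file, pass an open file handle instead:
#   with open("combined_1000.jsonl", encoding="utf-8") as fh:
#       print(count_jsonl_rows(fh))
demo = ['{"messages": []}\n', "\n", '{"messages": []}\n']
print(count_jsonl_rows(demo))
```

Because the function accepts any iterable of lines, the same code works on a file handle, a list, or a generator.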

	## Data formats

	## 1. `messages`-format data

	`combined_1000.jsonl` and the two dedup versions all follow the `messages` structure.

	Each sample usually consists of the following three turns.

	1. `system`
	2. `user`
	3. `assistant`

	Example:

	```json
	{
	  "messages": [
	    {
	      "role": "system",
	      "content": "당신은 우주항행 텍스트 MUD의 항로 안내자이자 세계관 해설자다. 답변은 한국어로 하며, 서정성과 실용성을 함께 지닌다. 플레이에 도움이 되는 정보와 분위기 묘사를 함께 준다."
	    },
	    {
	      "role": "user",
	      "content": "First Fire Horizon이 어떤 곳인지 설명해줘."
	    },
	    {
	      "role": "assistant",
	      "content": "First Fire Horizon은 시작 거점으로, 붉은 잔광과 오래된 기동음이 남아 있는 출항의 분위기가 감도는 장소다."
	    }
	  ]
	}
	```

	### Typed text messages version

	In `combined_1000.unsloth_gemma4_messages_dedup.jsonl`, `content` is not a plain string but a list of typed parts, as follows.

	```json
	{
	  "role": "user",
	  "content": [
	    {
	      "type": "text",
	      "text": "First Fire Horizon이 어떤 곳인지 설명해줘."
	    }
	  ]
	}
	```

	This version is prepared so that the type of `content` is interpreted unambiguously on the Gemma 4 template/processor path.
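The conversion from plain-string `content` to typed text parts is mechanical. The sketch below shows one way to do it, under the assumption that every message carries either a string or an already-typed list; `to_typed_content` and `to_typed_messages` are hypothetical names and are not taken from the repository's script.

```python
def to_typed_content(content):
    """Wrap a plain string as a one-element list of typed text parts."""
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return content  # assumption: anything else is already a list of typed parts


def to_typed_messages(sample):
    """Return a copy of the sample whose message contents are all typed."""
    return {
        "messages": [
            {"role": m["role"], "content": to_typed_content(m["content"])}
            for m in sample["messages"]
        ]
    }


sample = {"messages": [{"role": "user", "content": "scan signal"}]}
typed = to_typed_messages(sample)
print(typed["messages"][0]["content"])
```

The function builds new dictionaries instead of mutating the input, so the original string-based sample stays available for debugging.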

	## 2. Alpaca-style example data

	`gemma4_mud_alpaca_100.jsonl` is a small example dataset with the following fields.

	- `instruction`
	- `input`
	- `output`
	- `text`

	Example:

	```json
	{
	  "instruction": "look",
	  "input": "",
	  "output": "당신은 첫 불의 지평의 관측 갑판에 서 있다. ..."
	}
	```

	This file has a simple structure, which makes it well suited to explanation and first experiments, but for real Gemma 4 / Unsloth runs the `messages` format is the more natural fit.
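An Alpaca-style record can be lifted into the three-turn `messages` structure. The sketch below is one possible mapping, not the author's method: `alpaca_to_messages` is a hypothetical helper, the system prompt must be supplied by the caller because Alpaca records do not carry one, and joining a non-empty `input` on a new line is an assumption.

```python
def alpaca_to_messages(record, system_prompt):
    """Convert one Alpaca-style record into a 3-turn `messages` sample."""
    user_text = record["instruction"]
    if record.get("input"):
        # Assumption: a non-empty `input` is appended on its own line.
        user_text = f"{user_text}\n{record['input']}"
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": record["output"]},
        ]
    }


record = {"instruction": "look", "input": "", "output": "..."}
converted = alpaca_to_messages(record, system_prompt="<system prompt goes here>")
print([m["role"] for m in converted["messages"]])
```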

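	## Character of the data

	The following figures were observed on `combined_1000.unsloth_chatml_dedup.jsonl`.

	- Total rows: `963`
	- Messages per role:
	  - `system`: `963`
	  - `user`: `963`
	  - `assistant`: `963`
	- Average user length: about `33.0` characters
	- Average assistant length: about `114.6` characters
	- Maximum user length: `60` characters
	- Maximum assistant length: `362` characters
	- Unique system prompts: `4`

	In other words, this dataset is not built around long, complex multi-turn conversations; it centers on a short single request followed by a relatively short answer.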
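Figures of this kind can be recomputed from the ChatML-style file. The sketch below is a minimal illustration that assumes `content` is a plain string and that lengths are measured in characters; `summarize` is a hypothetical helper, and the demo rows are made up rather than taken from the dataset.

```python
from collections import Counter


def summarize(samples):
    """Compute role counts, character-length stats, and unique system prompts."""
    roles = Counter()
    lengths = {"user": [], "assistant": []}
    system_prompts = Counter()
    for sample in samples:
        for message in sample["messages"]:
            role, content = message["role"], message["content"]
            roles[role] += 1
            if role in lengths:
                lengths[role].append(len(content))  # length in characters
            elif role == "system":
                system_prompts[content] += 1
    summary = {
        "rows": len(samples),
        "roles": dict(roles),
        "unique_system_prompts": len(system_prompts),
    }
    for role, values in lengths.items():
        summary[f"avg_{role}_len"] = sum(values) / len(values) if values else 0.0
        summary[f"max_{role}_len"] = max(values, default=0)
    return summary


demo_samples = [
    {"messages": [
        {"role": "system", "content": "S1"},
        {"role": "user", "content": "look"},
        {"role": "assistant", "content": "You see stars."},
    ]},
    {"messages": [
        {"role": "system", "content": "S2"},
        {"role": "user", "content": "scan signal"},
        {"role": "assistant", "content": "Static."},
    ]},
]
summary = summarize(demo_samples)
print(summary)
```

Replacing `len(system_prompts)` with `system_prompts.most_common()` would also give the per-prompt row counts.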

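	## System prompt distribution

	The cleaned version has four system prompts.

	1. Route guide / lore narrator: `250`
	2. NPC dialogue writer: `250`
	3. Quest and system message writer: `250`
	4. Environment narrator: `213`

	This distribution shows that the dataset does not push a single writing style.
	Instead, it is designed to train the following four layers together.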

	- lore / setting explanation
	- NPC voice
	- quest / system flavor text
	- environment / look / scan narration

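	## User input tendencies

	Examples of keywords that appear frequently in user inputs in the cleaned version:

	- `신호` (signal): `60`
	- `퀘스트` (quest): `50`
	- `소문` (rumor): `50`
	- `연료` (fuel): `48`
	- `항로` (route): `33`
	- `표지등` (beacon): `22`
	- `도킹` (docking): `20`
	- `평판` (reputation): `18`
	- `수리` (repair): `14`
	- `거래` (trade): `14`
	- `전투` (combat): `10`

	This distribution means the dataset is not just a collection of lore explanations; it also covers the following input groups that come up often in a real text MUD.

	- Location descriptions
	- Signals / scans
	- Quest and system text
	- Gameplay expressions such as routes, docking, and fuel
	- World interactions such as reputation, rumors, and trade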
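A keyword tally like the one above can be approximated with a substring count. The sketch below counts how many samples have a user turn containing each keyword; whether the original figures count rows or total occurrences is not stated, so the row-based count is an assumption. `keyword_row_counts` is a hypothetical helper, and the demo inputs are invented for illustration.

```python
from collections import Counter


def keyword_row_counts(samples, keywords):
    """Count, per keyword, how many samples have a user turn containing it."""
    counts = Counter({keyword: 0 for keyword in keywords})
    for sample in samples:
        user_text = " ".join(
            m["content"] for m in sample["messages"] if m["role"] == "user"
        )
        for keyword in keywords:
            if keyword in user_text:
                counts[keyword] += 1
    return counts


# Made-up demo inputs, not rows from the dataset.
demo_samples = [
    {"messages": [{"role": "user", "content": "신호를 스캔해줘"}]},
    {"messages": [{"role": "user", "content": "연료와 신호 상태 알려줘"}]},
    {"messages": [{"role": "user", "content": "look"}]},
]
counts = keyword_row_counts(demo_samples, ["신호", "연료", "퀘스트"])
print(counts.most_common())
```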

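	## What this dataset is good at

	- Short, atmospheric responses
	- Korean-language lore explanations
	- MUD-toned NPC dialogue
	- Reactions to short commands such as room look / signal / rumor
	- Text that blends setting prose with functional explanation

	## What this dataset alone does not cover

	This dataset aims to reinforce the narrative layer, not to replace the engine, so the following areas are intentionally weak or left empty.

	- Precise state tracking
	- Numeric combat rules
	- Reward calculation
	- Quest completion checks
	- Long-term multi-turn memory
	- General-knowledge QA

	So if you want the model to do the following well, you need additional data.

	- Long conversations
	- Strict adherence to detailed world rules
	- Strict JSON output
	- Tool calls / function calls
	- Combat log commentary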

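	## Why a dedup version is needed

	The original `1000` rows contain samples that repeat exactly the same line.
	In small-scale fine-tuning, such duplicates can over-reinforce particular expressions and push the model toward a narrower style than the data actually represents.

	The cleaned version therefore went through the following steps:

	- exact-duplicate removal
	- whitespace / line-break cleanup
	- role/content structure validation
	- addition of a typed text messages version for Gemma 4

	The resulting `963` rows serve as the reference set.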
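The cleanup steps above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the repository's `prepare_unsloth_gemma4_dataset.py`: it trims surrounding whitespace and unifies line endings before comparing rows, requires the exact role order `system`, `user`, `assistant`, and keeps the first occurrence of each duplicate. The names `normalize` and `clean_and_dedup` are hypothetical.

```python
import json

EXPECTED_ROLES = ("system", "user", "assistant")


def normalize(text):
    """Unify line endings and trim surrounding whitespace."""
    return text.replace("\r\n", "\n").replace("\r", "\n").strip()


def clean_and_dedup(samples):
    """Normalize, validate, and drop exact duplicates, keeping first occurrences."""
    seen = set()
    kept = []
    for sample in samples:
        messages = [
            {"role": m["role"], "content": normalize(m["content"])}
            for m in sample["messages"]
        ]
        roles = tuple(m["role"] for m in messages)
        if roles != EXPECTED_ROLES or not all(m["content"] for m in messages):
            raise ValueError(f"unexpected structure: roles={roles}")
        # A canonical JSON string serves as the exact-duplicate key.
        key = json.dumps(messages, ensure_ascii=False, sort_keys=True)
        if key not in seen:
            seen.add(key)
            kept.append({"messages": messages})
    return kept


def make(user, assistant):
    return {"messages": [
        {"role": "system", "content": "S"},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]}


raw = [make("look", "Stars."), make("look", "Stars.\r\n"), make("scan", "Static.")]
cleaned = clean_and_dedup(raw)
print(len(raw), "->", len(cleaned))
```

Normalizing before building the key means rows that differ only in trailing whitespace are treated as duplicates; comparing the raw strings instead would be the stricter reading of "exact duplicate".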

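	## Why keep both the ChatML version and the Gemma 4 typed version

	The two are essentially identical in content but serve different purposes.

	### `combined_1000.unsloth_chatml_dedup.jsonl`

	- Easy for people to read
	- Close to the common chat fine-tuning format
	- Easy to debug

	### `combined_1000.unsloth_gemma4_messages_dedup.jsonl`

	- Safer on the Gemma 4 processor/template path
	- Lets you enforce the `content: [{type: "text", text: ...}]` structure
	- Makes conflicts with multimodal-style templates easier to avoid

	In practice, it is more stable to treat the typed messages version as the reference and, if needed, build the final `text` column from it for training.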

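	## Rules for creating new samples for this dataset

	When adding new data, the following rules are recommended.

	### 1. The basic unit is 1 sample = 1 system + 1 user + 1 assistant

	Keep the following three-turn structure where possible.

	- `system`: role and tone
	- `user`: short, clear input
	- `assistant`: a short answer with a lively tone

	### 2. Keep user short and assistant condensed

	The strength of this dataset is "short, vivid output for short input."
	New samples should therefore keep the same character.

	- user: a short command, a short request, or a one-sentence question
	- assistant: no more than two or three paragraphs, preferably short narration

	### 3. Mix the system roles

	As in the current distribution, mix the following four roles evenly.

	- Lore narrator
	- NPC dialogue writer
	- Quest / system message writer
	- Environment narrator

	If one role dominates, the style tends to skew in that direction.

	### 4. Do not declare engine verdicts directly

	Where possible, avoid samples in which the model itself declares any of the following.

	- Quest completed
	- Reward obtained
	- Door opened
	- Combat victory confirmed

	Such sentences can later lead the model to overstep into the game engine's role.

	### 5. Prefer in-world reactions over real-world lectures

	The purpose of this dataset is not general-purpose encyclopedic QA.
	So even when words such as `Oracle`, `signal`, or `route` appear, write the sample so that the response is an in-world reaction rather than an explanation in the style of real-world technical documentation.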

	## Recommended authoring template

	The most straightforward authoring template is as follows.

	```json
	{
	  "messages": [
	    {
	      "role": "system",
	      "content": "당신은 우주항행 텍스트 MUD의 환경 서술가다. 답변은 한국어로 하며, room look, scan, 이동 메시지, 위험 경고를 선명하고 플레이 친화적으로 작성한다."
	    },
	    {
	      "role": "user",
	      "content": "scan signal"
	    },
	    {
	      "role": "assistant",
	      "content": "수신기에는 부서진 맥동 하나가 낮게 걸린다. 신호는 오래전에 끊겼지만, 끝자락에는 아직도 누군가의 호출 습관 같은 규칙이 남아 있다."
	    }
	  ]
	}
	```

	To write a sample directly in the Gemma 4 typed messages format:

	```json
	{
	  "messages": [
	    {
	      "role": "system",
	      "content": [{"type": "text", "text": "당신은 우주항행 텍스트 MUD의 NPC 대사 작성자다. 답변은 한국어로 하며, 역할에 맞는 개성과 게임 플레이 단서를 함께 담는다."}]
	    },
	    {
	      "role": "user",
	      "content": [{"type": "text", "text": "talk oracle"}]
	    },
	    {
	      "role": "assistant",
	      "content": [{"type": "text", "text": "예언자는 숨을 고른 뒤 낮게 말한다. 첫 불은 길을 밝히는 것이 아니라, 돌아올 자리를 기억하게 만든다."}]
	    }
	  ]
	}
	```

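	## Recommended production order

	When building a new dataset, the following order is recommended.

	1. Split the role groups first
	   - lore / NPC / quest-system / environment
	2. Define the user input groups up front
	   - command, explanatory, hint
	3. Limit assistant length
	   - keep answers from growing too long
	4. Remove exact duplicates
	5. Build both the ChatML version and the Gemma 4 typed version
	6. Run actual inference tests first
	   - `talk oracle`
	   - `look`
	   - `scan signal`
	   - `First Fire Horizon이 어떤 곳인지 설명해줘.`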

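	## Quality checklist

	When adding to or modifying the dataset, check the following.

	- Is the Korean tone consistent?
	- Are the system roles free of heavy skew toward one role?
	- Are user inputs kept from growing too long?
	- Are assistant answers kept from becoming verbose?
	- Have any samples that drift into real-world lectures slipped in?
	- Do any samples contain game-engine verdict sentences?
	- Do any exact duplicates remain?
	- Is there any `content` type conflict on the Gemma 4 template path?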
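Some of these checks are mechanical and can be automated; tone and lore fidelity still need a human reader. The sketch below covers only structure, user length, and engine-verdict phrases. Everything in it is an assumption made for illustration: `check_sample` and `as_text` are hypothetical names, the 60-character cap simply reuses the longest user input reported for the cleaned set, and the phrase list is a tiny illustrative sample rather than an exhaustive filter.

```python
EXPECTED_ROLES = ("system", "user", "assistant")
MAX_USER_CHARS = 60  # assumption: the longest user input reported for the cleaned set
# Illustrative Korean phrases for "quest complete" and "reward obtained"; not exhaustive.
ENGINE_VERDICT_PHRASES = ("퀘스트 완료", "보상 획득")


def as_text(content):
    """Flatten either a plain string or a list of typed text parts to one string."""
    if isinstance(content, str):
        return content
    return "".join(part.get("text", "") for part in content if part.get("type") == "text")


def check_sample(sample):
    """Return a list of issues found in one sample; an empty list means it passed."""
    messages = sample.get("messages", [])
    roles = tuple(m.get("role") for m in messages)
    if roles != EXPECTED_ROLES:
        return [f"roles are {roles}, expected {EXPECTED_ROLES}"]
    text = {m["role"]: as_text(m["content"]).strip() for m in messages}
    issues = [f"empty {role} content" for role in EXPECTED_ROLES if not text[role]]
    if len(text["user"]) > MAX_USER_CHARS:
        issues.append(f"user input longer than {MAX_USER_CHARS} characters")
    if any(phrase in text["assistant"] for phrase in ENGINE_VERDICT_PHRASES):
        issues.append("assistant text contains an engine verdict phrase")
    return issues


good = {"messages": [
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": [{"type": "text", "text": "scan signal"}]},
    {"role": "assistant", "content": "수신기에는 부서진 맥동 하나가 낮게 걸린다."},
]}
bad = {"messages": [
    {"role": "system", "content": "<system prompt>"},
    {"role": "user", "content": "talk oracle"},
    {"role": "assistant", "content": "퀘스트 완료! 보상 획득."},
]}
print(check_sample(good))
print(check_sample(bad))
```

Because `as_text` accepts both plain strings and typed parts, the same checker can run over the ChatML file and the Gemma 4 typed file.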

	## The preparation script actually used in this repository

	The starter package also includes a data preparation script.

	- [`gemma-mud-colab-starter/scripts/prepare_unsloth_gemma4_dataset.py`](./gemma-mud-colab-starter/scripts/prepare_unsloth_gemma4_dataset.py)

	The script is kept so that repetitive tasks such as message cleanup, dedup, and generation of the Unsloth / Gemma 4 formats are easy to rerun.

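	## Extending the dataset

	The current dataset is sufficient as "a first bundle for learning style and structure," but for higher quality the following directions are recommended.

	- Split out dedicated per-NPC datasets
	- Expand the lore dictionary by region / hub
	- Split out quest-system messages
	- Collect more `look` / `rumor` / `scan` samples
	- Add long-conversation data
	- Evaluate together with engine post-processing rules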

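	## Why this document is kept separate from the model card

	A model card usually needs to show the following quickly.

	- What the model does
	- How to use it
	- Which files are included

	Dataset analysis, by contrast, needs the following.

	- Cleaning process
	- Format differences
	- Reasons for duplicate removal
	- Authoring rules
	- Directions for extension

	Putting both into a single document makes the model card too long and makes it hard for its core users to find information quickly.
	That is why separating the model card and the dataset guide suits this repository better.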