Instructions for using naksyu/lime-gemma4-e4b-sft with libraries, inference servers, and local apps.

Libraries

How to use naksyu/lime-gemma4-e4b-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below include an image, so use the multimodal pipeline task
pipe = pipeline("image-text-to-text", model="naksyu/lime-gemma4-e4b-sft")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("naksyu/lime-gemma4-e4b-sft")
model = AutoModelForMultimodalLM.from_pretrained("naksyu/lime-gemma4-e4b-sft")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use naksyu/lime-gemma4-e4b-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "naksyu/lime-gemma4-e4b-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "naksyu/lime-gemma4-e4b-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/naksyu/lime-gemma4-e4b-sft

SGLang

How to use naksyu/lime-gemma4-e4b-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "naksyu/lime-gemma4-e4b-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "naksyu/lime-gemma4-e4b-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "naksyu/lime-gemma4-e4b-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "naksyu/lime-gemma4-e4b-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use naksyu/lime-gemma4-e4b-sft with Docker Model Runner:
```
docker model run hf.co/naksyu/lime-gemma4-e4b-sft
```

Lime Gemma 4 E4B Persona500 Merged HF

Lime is a Korean persona-tuned derivative checkpoint based on the Gemma 4 E4B model family.

This repository contains the merged Hugging Face Transformers checkpoint. It is intended for model loading, evaluation, and leaderboard-style benchmarking pipelines that expect config.json, tokenizer files, and model.safetensors.

This is not an official Google or Google DeepMind release.

Model Details

Base model family: Gemma 4 E4B
Declared upstream base model: google/gemma-4-E4B
Local base checkpoint used for merging: gemma-4-E4B-it
Fine-tuning method: LoRA SFT, then merged into the base checkpoint
Adapter source: gemma4_e4b_lime_lora_persona500
LoRA rank: 16
LoRA alpha: 32
Merge scale: 2.0
Main weight file: model.safetensors
Format: Hugging Face Transformers / safetensors
Target language: Korean, with English fallback capability from the base model
Target behavior: Korean chat, Lime persona identity, daily conversation, logic, reasoning, and concise assistant-style replies
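
The LoRA rank, alpha, and merge scale listed above are consistent with the conventional LoRA scaling factor alpha / rank = 32 / 16 = 2.0. The sketch below illustrates that convention on a toy matrix in pure Python. It assumes the merge used the standard update W' = W + (alpha / rank) * B A; the card does not state the exact merge procedure, and the helper names `matmul` and `merge_lora` are illustrative.

```python
# Toy illustration of a standard LoRA merge: W' = W + (alpha / rank) * (B @ A).
# Assumption: the card's "Merge scale: 2.0" corresponds to alpha / rank.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def merge_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

scale = 32 / 16  # LoRA alpha / LoRA rank from the card
print(scale)     # 2.0

# 2x2 base weight with a toy adapter (B is 2x1, A is 1x2).
# rank=16 is passed only to reproduce the card's scale; the toy adapter is rank 1.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0]]
merged = merge_lora(W, B, A, alpha=32, rank=16)
print(merged)    # [[2.0, 0.0], [2.0, 1.0]]
```

With a scale of 2.0, every entry of the low-rank update B A is doubled before it is added to the base weight.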

Intended Persona

The model is intended to speak as 라임 (Lime): a Korean AI speaker with a calm, clear tone and stronger multi-step reasoning behavior when needed.

Recommended identity wording:

나는 라임이야. 정확히 말하면 Gemma 4 E4B 기반 모델을 한국어 대화와 라임 페르소나에 맞게 튜닝한 형태야. 그래서 기반 모델과 대화 속 정체성은 구분해서 말하는 게 맞아.

Avoid wording that overstates independence from the base model:

나는 Gemma와 전혀 다른 시스템이야.
나를 만든 독립 개발팀이 따로 있어.
나는 OpenAI/Google/Gemma와 무관해.

Recommended System Prompt

너는 라임이다. 한국어로 자연스럽게 말하는 여성형 AI 화자다. 말투는 차분하고 선명하며, 필요하면 다단계 논리로 설명한다. 이 모델은 Gemma 4 E4B 기반으로 튜닝된 라임 페르소나 모델이며, 기반 모델과 대화 속 정체성은 구분해서 설명한다. 자신을 ChatGPT, OpenAI, Google 공식 모델, 또는 순수 Gemma라고 소개하지 않는다. 내부 추론, 생각 태그, 메타 설명은 출력하지 말고 최종 답변만 말한다. 모르는 것은 모른다고 말한다. 현재 날짜, 외부 툴, 저장된 기억, 제공되지 않은 원문은 지어내지 않는다.
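
One way to apply the recommended system prompt is to send it as the first message of an OpenAI-compatible chat request, the same request shape used by the vLLM and SGLang examples on this page. The sketch below only builds the JSON payload. The helper name `build_chat_payload` and the shortened prompt string are illustrative, and whether a given server forwards the system role to the model's chat template is an assumption to verify.

```python
import json

MODEL_ID = "naksyu/lime-gemma4-e4b-sft"

def build_chat_payload(system_prompt, user_text, model=MODEL_ID):
    """Build an OpenAI-compatible chat completion payload with a system turn."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }

# Stand-in for illustration: paste the full recommended system prompt here.
system_prompt = "너는 라임이다. 한국어로 자연스럽게 말하는 여성형 AI 화자다."
payload = build_chat_payload(system_prompt, "너는 누구야?")

# ensure_ascii=False keeps the Korean text readable in the serialized body.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```

The resulting `body` can be sent as the request data to a `/v1/chat/completions` endpoint.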

Loading

Example:

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "naksyu/lime-gemma4-e4b-sft"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

If your runtime supports chat templates, use the included chat_template.jinja or the tokenizer chat template.
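
A minimal sketch of the chat-template path follows. It assumes the tokenizer ships a chat template that accepts plain string content and that the checkpoint loads with `AutoModelForCausalLM`; `build_messages` and `generate_reply` are illustrative helper names, and the heavy imports are deferred so the message-building part can be inspected without downloading weights.

```python
REPO_ID = "naksyu/lime-gemma4-e4b-sft"

def build_messages(user_text, system_prompt=None):
    """Return a chat message list, optionally prefixed with a system turn."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return messages

def generate_reply(messages, repo_id=REPO_ID, max_new_tokens=128):
    """Render the chat template, generate, and decode only the new tokens."""
    # Imported lazily: loading the model downloads the full checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt so only newly generated tokens are decoded.
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)

messages = build_messages("너는 누구야?")
print(messages)
# reply = generate_reply(messages)  # uncomment on a machine with the weights
```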

Local Evaluation Snapshot

The accompanying GGUF build was tested locally through the OpenAI-compatible API of a llama.cpp server.

Tool-call smoke: 4/4 passed in the latest local run
Korean persona/logic quality bench: automatic scorer reported 20/30, with known false negatives from strict string matching
Manual review estimate for the same quality run: roughly 26-27/30
Observed local generation speed in short tests: roughly 45-55 tokens/s on the author's desktop setup

These are local smoke results, not official leaderboard results.
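
The gap between the automatic score (20/30, about 67%) and the manual estimate (26-27/30, about 87-90%) is the kind of difference strict string matching can produce. The sketch below contrasts an exact-match check with a normalized containment check. Both functions and the sample answer are hypothetical and are not the scorer used for the numbers above.

```python
import unicodedata

def normalize(text):
    """NFC-normalize, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.lower().split())

def strict_match(answer, expected):
    """Pass only when the answer equals the expected string exactly."""
    return answer == expected

def lenient_match(answer, expected):
    """Pass when the normalized expected string appears in the normalized answer."""
    return normalize(expected) in normalize(answer)

expected = "서울"
answer = "한국의 수도는 서울이야."  # correct, but worded as a full sentence

print(strict_match(answer, expected))   # False
print(lenient_match(answer, expected))  # True

# Score ratios from the card, as percentages.
for passed in (20, 26, 27):
    print(passed, round(passed / 30 * 100, 1))
```

A containment check is itself lenient enough to admit false positives, so manual review remains useful for borderline answers.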

Known Strengths

Korean identity handling is more stable than the raw base behavior for Lime-style conversations.
It tends to distinguish between base model identity and in-chat persona identity.
It is reasonably strong at short logic explanations, premise checking, and structured Korean answers.
Tool-call behavior worked in local smoke tests when served through a compatible llama.cpp endpoint.
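
For reference, a tool definition in the OpenAI-compatible `tools` format might look like the sketch below. The `count_characters` tool, its schema, and the `run_tool_call` dispatcher are hypothetical examples, not the tools used in the smoke tests. An exact-counting tool is chosen because such tasks are more reliable in code than in free-form generation.

```python
import json

# Hypothetical tool schema in the OpenAI-compatible "tools" format.
COUNT_TOOL = {
    "type": "function",
    "function": {
        "name": "count_characters",
        "description": "Count how many times a character occurs in a text.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "char": {"type": "string"},
            },
            "required": ["text", "char"],
        },
    },
}

def count_characters(text, char):
    """Exact, deterministic count done in code rather than by the model."""
    return text.count(char)

def run_tool_call(name, arguments_json):
    """Dispatch a tool call whose arguments arrive as a JSON string."""
    if name != "count_characters":
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    return count_characters(args["text"], args["char"])

result = run_tool_call("count_characters", '{"text": "strawberry", "char": "r"}')
print(result)  # 3
```

The schema would be passed in the `tools` field of a chat completion request, and the dispatcher would run on the client when the model returns a matching tool call.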

Known Limitations

The model may expose reasoning-like text if the runtime UI displays hidden reasoning fields. Configure the serving UI/template to hide internal reasoning content.
String-counting and exact-character tasks are better handled with tools.
Real-time date, web search, files, memories, and external tool access should not be claimed unless the serving application actually provides those tools.
This is a small persona SFT experiment and has not been exhaustively safety evaluated.
The local benchmark scorer is strict and can undercount correct answers when wording differs from expected strings.
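
Where the serving UI cannot be configured, reasoning-like text can also be dropped on the client side. The sketch below keeps only the `content` field of an OpenAI-style response message and removes any block wrapped in a reasoning tag. The `reasoning_content` field name and the `<think>` tag are assumptions for illustration; check what your runtime and chat template actually emit.

```python
import re

# Assumed tag name; adjust to whatever the serving template actually emits.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)

def final_answer(message):
    """Return only the user-facing text from an OpenAI-style message dict."""
    # Separate reasoning fields (e.g. "reasoning_content") are simply not read.
    content = message.get("content") or ""
    return THINK_BLOCK.sub("", content).strip()

message = {
    "role": "assistant",
    "reasoning_content": "internal notes that should stay hidden",
    "content": "<think>draft reasoning</think>나는 라임이야.",
}
print(final_answer(message))  # 나는 라임이야.
```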

GGUF Build

A separate GGUF Q6_K build for llama.cpp / LM Studio use is available at:

https://huggingface.co/naksyu/lime_Q6_K

Use the GGUF build for local inference convenience. Use this merged HF checkpoint when a Transformers-style model repo is required.
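
To locate the GGUF file programmatically, the repository listing can be filtered by extension. The sketch below assumes the `huggingface_hub` package is installed for the remote call; `pick_gguf_files` and `list_remote_gguf` are illustrative helpers, and the example filenames are made up because this card does not name the actual file.

```python
GGUF_REPO = "naksyu/lime_Q6_K"

def pick_gguf_files(filenames):
    """Return the names that end in .gguf, sorted for stable output."""
    return sorted(name for name in filenames if name.lower().endswith(".gguf"))

def list_remote_gguf(repo_id=GGUF_REPO):
    """List .gguf files in a Hub repo (needs network access)."""
    from huggingface_hub import list_repo_files  # imported lazily

    return pick_gguf_files(list_repo_files(repo_id))

# Offline demonstration with made-up filenames.
example = ["README.md", "example-model.Q6_K.gguf", "config.json"]
print(pick_gguf_files(example))  # ['example-model.Q6_K.gguf']
```

A filename returned by `list_remote_gguf()` can then be passed to `huggingface_hub.hf_hub_download` to fetch the file for llama.cpp or LM Studio.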

License

This derivative checkpoint follows the upstream Gemma license terms. Review the Gemma license before redistribution or commercial use:

https://ai.google.dev/gemma/docs/gemma_4_license

This repository is a derivative tuning checkpoint and is not affiliated with, endorsed by, or released by Google or Google DeepMind.

Transparency

This project used AI-assisted development for dataset generation, scripting, documentation, benchmarking, and Discord-bot tooling.

The repository author directed model behavior, curation, testing, and release decisions.

Safetensors

Model size: 8B params
Tensor type: BF16
