---
license: gemma
language:
- ko
- en
base_model:
- google/gemma-3-12b-it
pipeline_tag: text-generation
tags:
- korean
- defense
- instruction-tuned
- domain-adaptive
library_name: transformers
---
# KorDef-LLM

**Korean Defense Domain Instruction-Tuned Language Model**

KorDef-LLM is a 12B-parameter language model fine-tuned from `google/gemma-3-12b-it` on a domain-specific instruction corpus drawn from publicly available, unclassified Korean defense administrative-rule (행정규칙) and educational PDFs.

This model accompanies the manuscript **"An Open Pipeline for Domain-Adaptive Instruction Tuning of Korean Defense Large Language Models"** (submitted to PeerJ Computer Science). It is released for **research and educational use** only, with the limitations and out-of-scope uses described below.
## Released Artifacts

| Component | Location |
|---|---|
| Model weights (this page) | [HuggingFace `graphuser/kordef-12b`](https://huggingface.co/graphuser/kordef-12b) |
| Instruction corpus + evaluation set | [Zenodo `10.5281/zenodo.20083055`](https://doi.org/10.5281/zenodo.20083055) |
| Inference and evaluation code | [GitHub `gshwan22/KorDef-LLM`](https://github.com/gshwan22/KorDef-LLM) |
## Model Description

- **Base model**: `google/gemma-3-12b-it` (Gemma-3, 12B parameters, instruction-tuned)
- **Fine-tuning**: Supervised instruction tuning (full SFT, FSDP distributed; not LoRA)
- **Domain**: Korean defense administrative rules, doctrine documents, and educational materials (all publicly available, unclassified)
- **Training corpus**: Combined prompt-generated and document-grounded instruction-response pairs; the prompt-generated subset (235,367 pairs) is publicly released via Zenodo
- **Training steps**: 7,875
## Intended Use

KorDef-LLM is intended for:

- Research on Korean professional-domain language modeling and domain adaptation
- Educational reference-style question answering over Korean defense administrative-rule documents
- Comparison studies and reproducibility evaluations in Korean NLP
- A base model for further research-oriented fine-tuning in related Korean professional domains

The model is **NOT** intended for:

- Autonomous decision-making in military operations, procurement, maintenance, targeting, or any safety-critical procedure
- Generation of classified, sensitive, or operationally restricted content
- Deployment in real-world high-stakes settings without institutional review, retrieval grounding, and human expert oversight
- Any use that violates applicable laws, regulations, or the Gemma Terms of Use
## Evaluation Summary

KorDef-LLM was evaluated on two complementary benchmarks; full details are reported in the accompanying paper.

### KMMLU (general Korean reasoning, 5-shot)

| Model | KMMLU (%) |
|---|---|
| A.X-4.0-Light | 55.7 |
| **KorDef-LLM (ours)** | **48.0** |
| Gemma-3-12B (base) | 46.0 |
| Qwen-2.5-7B-Instruct | 45.8 |
| EXAONE-3.5-7.8B-Instruct | 45.3 |
| Llama-3.1-8B-Instruct | 41.6 |

KorDef-LLM ranks second among the six compared models on KMMLU. It exceeds the base model by 2.0 points and also outperforms three other open Korean/multilingual baselines, which indicates that domain-adaptive instruction tuning preserved general Korean reasoning ability.
### Source-Grounded Evaluation (N=323, public defense PDFs)

Paired comparison against the base Gemma-3-12B under identical context, prompt, and decoding conditions:

| Metric | Gemma-3-12B | **KorDef-LLM** | Δ | p (Wilcoxon) |
|---|---|---|---|---|
| Token-F1 | 0.398 | **0.428** | +0.030 | < 1e-7 |
| ROUGE-L | 0.380 | **0.402** | +0.022 | < 1e-3 |
| Character 3-gram Jaccard | 0.258 | **0.281** | +0.023 | < 1e-4 |
| Evidence-token recall | 0.534 | 0.549 | +0.015 | 0.108 (n.s.) |
| Mean answer tokens | 45.2 | 41.2 | −4.0 | < 1e-11 |

KorDef-LLM shows statistically significant improvements over the base model on the three content-overlap metrics (Token-F1, ROUGE-L, character 3-gram Jaccard) and produces significantly shorter answers, with no significant change in evidence recall or refusal rate.
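
The overlap metrics in the table can be illustrated with a minimal Python sketch. It assumes whitespace tokenization and a set-based definition of evidence recall; the paper's exact tokenizer and metric definitions may differ, and the function names (`token_f1`, `char_ngram_jaccard`, `evidence_token_recall`) are illustrative rather than taken from the released evaluation code. ROUGE-L is omitted for brevity.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 over whitespace tokens (the tokenizer is an assumption)."""
    pred, ref = prediction.split(), reference.split()
    if not pred or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def char_ngram_jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity between the sets of character n-grams of a and b."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def evidence_token_recall(answer: str, evidence: str) -> float:
    """Fraction of distinct evidence tokens that also appear in the answer."""
    evidence_tokens = set(evidence.split())
    if not evidence_tokens:
        return 0.0
    return len(evidence_tokens & set(answer.split())) / len(evidence_tokens)
```

Averaging each function over the 323 evaluation items for both models would give per-model means comparable in form to the table rows, though not necessarily the same values.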

In a cross-model comparison against five baselines (Gemma-3-12B, EXAONE-3.5-7.8B, Qwen-2.5-7B, Llama-3.1-8B, A.X-4.0-Light) on the same evaluation set, **KorDef-LLM achieves the highest mean evidence-token recall**, the metric most directly tied to source faithfulness in source-grounded QA. The train/eval overlap audit confirms zero exact question, zero exact answer, and zero near-question (Jaccard ≥ 0.80) overlap between the training corpus and the evaluation set.
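
A train/eval overlap audit of this kind could be sketched as follows. This is an assumption-laden illustration, not the authors' audit script: it computes Jaccard similarity over whitespace tokens (the card does not say whether tokens or characters were used), and `token_jaccard` and `audit_overlap` are hypothetical names.

```python
def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the whitespace-token sets of two strings."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def audit_overlap(train_pairs, eval_pairs, threshold=0.80):
    """Count eval items whose question or answer overlaps the training set.

    Both arguments are iterables of (question, answer) string pairs.
    Near-question matches exclude exact matches, which are counted separately.
    """
    train_questions = {q for q, _ in train_pairs}
    train_answers = {a for _, a in train_pairs}
    counts = {"exact_question": 0, "exact_answer": 0, "near_question": 0}
    for question, answer in eval_pairs:
        if question in train_questions:
            counts["exact_question"] += 1
        elif any(token_jaccard(question, tq) >= threshold for tq in train_questions):
            counts["near_question"] += 1
        if answer in train_answers:
            counts["exact_answer"] += 1
    return counts
```

The near-question check is a brute-force scan over all training questions for each evaluation question; at this corpus size that is feasible but slow, and an index such as MinHash would be a natural substitute.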
## Known Limitations

1. **Effect sizes are modest.** The improvements over the base model on the source-grounded evaluation are statistically significant but small in absolute magnitude (~3 percentage points on Token-F1). The model is not a substitute for retrieval-augmented generation or human expert review.
2. **Evidence recall and refusal rate are not significantly improved.** While source-grounded inference shows favorable trends on these source-faithfulness metrics, neither reaches statistical significance against the base model. Source faithfulness in a deployed system should be enforced via retrieval grounding and explicit citation requirements.
3. **The training corpus is only partially released.** Only the prompt-generated subset of the training corpus is publicly available via Zenodo. That released subset, the source manifest, segments, and evaluation set are available; the model weights are released here.
4. **No human expert evaluation.** Evaluation was conducted using automatic metrics only. Future deployments in any operational or educational context should be validated by qualified Korean defense doctrine experts.
5. **Defense-domain language specificity.** The model is tuned for Korean defense administrative-rule and educational text style. It may produce overly formal or excessively verbose responses outside this domain.
6. **Hallucination risk.** Like all large language models, KorDef-LLM may generate plausible-sounding but factually incorrect content, especially when asked about topics not covered by its training corpus or when source context is incomplete.
## Safety Considerations

- **Dual-use awareness**: Defense-domain language modeling carries inherent dual-use considerations. The released model and corpus contain only publicly available administrative-rule and educational content, not operational, tactical, or classified information.
- **Recommended deployment pattern**: For any real-world use, we recommend retrieval-augmented generation with explicit source citation, deployment within controlled (e.g., air-gapped) infrastructure, and human expert review of outputs in any consequential workflow.
- **Memorization and data extraction**: The model has been trained on Korean defense administrative-rule text. While the training data is unclassified, users should still exercise caution regarding prompts that attempt to extract training data verbatim.
- **Prompt injection**: As with all instruction-tuned LLMs, the model may be vulnerable to prompt-injection attacks in deployed agentic settings. Defensive measures (input sanitization, instruction layering, output filtering) are recommended.
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "graphuser/kordef-12b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map={"": 0},  # single GPU; avoids CPU offload
)

# Source-grounded prompting (recommended pattern).
# English gloss: "Refer to the following [Source] and answer the [Question] accurately."
# Replace the two parenthesized placeholders with the source excerpt and the question.
prompt = """다음 [출처]를 참고하여 [질문]에 정확히 답변하시오.
[출처]
(여기에 관련 행정규칙 또는 문서 발췌 삽입)
[질문]
(여기에 질문 작성)
[답변]"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=192,
    do_sample=False,  # greedy decoding
    repetition_penalty=1.05,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
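
To fill the source-grounded template programmatically, a small helper along these lines may be convenient. This is a sketch: `build_grounded_prompt` is a hypothetical name, not part of the released code, and the line layout mirrors the template above only as far as it can be read from this card.

```python
def build_grounded_prompt(source: str, question: str) -> str:
    """Fill the [출처]/[질문]/[답변] (source/question/answer) template."""
    return (
        # "Refer to the following [Source] and answer the [Question] accurately."
        "다음 [출처]를 참고하여 [질문]에 정확히 답변하시오.\n"
        f"[출처]\n{source.strip()}\n"
        f"[질문]\n{question.strip()}\n"
        "[답변]"
    )


# Placeholder inputs; substitute a real document excerpt and question.
prompt = build_grounded_prompt("(source excerpt)", "(question)")
```

The resulting string can be passed to the tokenizer in place of the hand-written prompt. Ending the prompt at `[답변]` leaves the model to continue with the answer text.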
## Citation

If you use this model, please cite the paper and the dataset:

```bibtex
@article{gwak2026kordef,
  title = {An Open Pipeline for Domain-Adaptive Instruction Tuning of Korean Defense Large Language Models},
  author = {Gwak, Sang-Hwan and Choi, Ji-Young and Jeong, Chang-Hoo and Lee, Gunwoo and Kim, Ina and Lee, Kyung-Ha},
  journal = {PeerJ Computer Science (submitted)},
  year = {2026}
}

@dataset{kordef_corpus_2026,
  title = {KorDef-LLM: Korean Defense Domain Instruction Corpus and Source-Grounded Evaluation Set},
  author = {Gwak, Sang-Hwan and others},
  year = {2026},
  publisher = {Zenodo},
  doi = {10.5281/zenodo.20083055}
}
```
## License

- **Model weights**: Gemma Terms of Use (the model is fine-tuned from `google/gemma-3-12b-it`). Users must comply with the [Gemma Terms](https://ai.google.dev/gemma/terms).
- **Released corpus** (Zenodo): CC-BY-4.0
- **Code** (GitHub): MIT

## Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT and DAPA) (No. RS-2024-00452972).

## Contact

For questions about this model or the accompanying paper, please contact the corresponding author at `kyongha@kisti.re.kr` or open an issue on the [GitHub repository](https://github.com/gshwan22/KorDef-LLM).