Image-Text-to-Text
Transformers
Safetensors
Korean
gemma4
darwin
darwin-v8
korean
administrative-ai
public-sector
government
multimodal
reasoning
thinking
conversational
gpqa
benchmark
leaderboard
k-ai
k-ai-leaderboard
vidraft
jgos
text-generation
ffn-transfer
model-merge
Eval Results
Instructions to use JGOS-Model/JGOS-31B-Citizen with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use JGOS-Model/JGOS-31B-Citizen with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="JGOS-Model/JGOS-31B-Citizen")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("JGOS-Model/JGOS-31B-Citizen")
model = AutoModelForMultimodalLM.from_pretrained("JGOS-Model/JGOS-31B-Citizen")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use JGOS-Model/JGOS-31B-Citizen with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "JGOS-Model/JGOS-31B-Citizen"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "JGOS-Model/JGOS-31B-Citizen",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
Use Docker
```bash
docker model run hf.co/JGOS-Model/JGOS-31B-Citizen
```
- SGLang
How to use JGOS-Model/JGOS-31B-Citizen with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "JGOS-Model/JGOS-31B-Citizen" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "JGOS-Model/JGOS-31B-Citizen",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "JGOS-Model/JGOS-31B-Citizen" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "JGOS-Model/JGOS-31B-Citizen",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
```
- Docker Model Runner
How to use JGOS-Model/JGOS-31B-Citizen with Docker Model Runner:
```bash
docker model run hf.co/JGOS-Model/JGOS-31B-Citizen
```
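The curl calls to the vLLM and SGLang servers above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send` are hypothetical, and the reply parsing assumes the usual OpenAI-compatible response shape (`choices[0].message.content`), which the page itself does not show.

```python
import json
import urllib.request

MODEL_ID = "JGOS-Model/JGOS-31B-Citizen"


def build_chat_request(base_url, prompt, image_url=None):
    """Build (but do not send) a POST request for /v1/chat/completions.

    base_url is the server root, e.g. http://localhost:8000 for vLLM or
    http://localhost:30000 for SGLang, as in the commands above.
    """
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request to a running server and return the reply text.

    Assumption: the server answers in the OpenAI-compatible format.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires one of the servers above to be running):
# req = build_chat_request(
#     "http://localhost:8000",
#     "Describe this image in one sentence.",
#     image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
# )
# print(send(req))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.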
feat: add Training Datasets section (AIHub K-AI optimized)
README.md
CHANGED
@@ -70,6 +70,25 @@ JGOS-31B-Citizen is built on VIDRAFT's **Darwin V8** platform.
 |----------------------------|-------|
 | maj@8 + tie-retry + DELPHI + near-miss maj@32–64 (weighted vote) | **84.34%** (167/198) |

+
+## Training Datasets
+
+JGOS-31B-Citizen was trained on large-scale Korean corpora sourced from the **Korean AI Hub (AIHub)**, Korea's national AI data repository operated by NIA (National Information Society Agency). The following datasets were used to optimize performance on the **K-AI Leaderboard** benchmarks (KoMMLU-Pro, CliCK, HLE, MuSR, Com2):
+
+| # | Dataset Name | AIHub Link |
+|---|---|---|
+| 1 | Medical and Legal Professional Books Corpus | [71487](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=71487) |
+| 2 | Financial and Legal Document Machine Reading Comprehension | [71610](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=71610) |
+| 3 | Large-scale Web-based Korean Corpus | [624](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=624) |
+| 4 | Large-scale Book-based Korean Corpus | [653](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=653) |
+| 5 | National Records Large-scale AI Learning Corpus | [71788](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=71788) |
+| 6 | Korean Generation-based Common Sense Reasoning Dataset | [459](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=459) |
+| 7 | Multi-session Dialogue Corpus | [pkg1](https://aihub.or.kr/aihubdata/data/view.do?currMenu=511&topMenu=100&aihubDataSe=dataPckage&dataPckageSn=1) |
+| 8 | Essential Medical Knowledge Data (142GB) | [71875](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=71875) |
+| 9 | Specialized Medical Knowledge Data (206GB) | [71874](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=71874) |
+| 10 | Korean Dialogue Dataset | [272](https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=272) |
+
+> All datasets are publicly available via [AIHub](https://aihub.or.kr) (registration required).
 ## License

 This model is built on a Gemma-family architecture and is distributed under the [**Gemma Terms of Use**](https://ai.google.dev/gemma/terms). By using this model, you agree to the Gemma license terms.
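The evaluation row shown as context in the diff above reports a "maj@8 + tie-retry + DELPHI + near-miss maj@32–64 (weighted vote)" protocol scoring 84.34% (167/198). The README excerpt does not describe how these steps are implemented, so the following is only a minimal sketch of generic weighted majority voting with tie detection, plus the accuracy arithmetic. The function names are hypothetical, the DELPHI and near-miss stages are not modelled, and nothing here should be read as the authors' actual pipeline.

```python
from collections import defaultdict


def weighted_majority_vote(answers, weights=None):
    """Return (winner, is_tie) for a list of sampled answers.

    With weights=None every sample counts equally (plain maj@k).
    is_tie is True when several answers share the top total; a caller
    could use that flag to trigger a retry with fresh samples.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    if weights is None:
        weights = [1.0] * len(answers)
    if len(weights) != len(answers):
        raise ValueError("answers and weights must have the same length")

    totals = defaultdict(float)
    for answer, weight in zip(answers, weights):
        totals[answer] += weight

    best = max(totals.values())
    # Sort the leaders so the result is deterministic when there is a tie.
    leaders = sorted(a for a, total in totals.items() if total == best)
    return leaders[0], len(leaders) > 1


def accuracy_percent(correct, total):
    """Accuracy as a percentage rounded to two decimals."""
    return round(100.0 * correct / total, 2)


# Eight samples for one multiple-choice question (maj@8).
print(weighted_majority_vote(list("ABACADAB")))
# The score reported in the table row: 167 correct out of 198.
print(accuracy_percent(167, 198))
```

Returning the tie flag instead of silently picking a winner keeps the retry policy in the caller's hands.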