AIOne-Agent-52B-A36B-it
A 52B / A36B sparse Mixture-of-Experts multimodal model for Korean reasoning, image understanding, and video understanding.
Model Description
AIOne-Agent-52B-A36B-it is a Korean-tuned multimodal Mixture-of-Experts (MoE) model based on Gemma 4 31B IT. It retains the full text, image, and video capabilities of the base Gemma 4 family and adds a Korean-domain MoE branch that routes each token to a subset of its experts at inference time.
- Multimodal. Accepts text, images, and video; produces fluent Korean (and English) responses.
- Sparse MoE (top_k=2 of 8 experts) with always-on dense shared MLP. ~36 B parameters are active per token in the text backbone, while the full text backbone holds ~52 B parameters worth of capacity.
- Long context. 256K tokens, inherited from the base model.
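The routing in the Sparse MoE bullet above can be illustrated with a toy, pure-Python sketch. It is not the model's implementation: only num_experts=8, top_k=2, and the always-on shared path come from this card. The toy width, the single linear map standing in for each expert MLP, the softmax over the selected logits, and all function names (`route`, `moe_token`, `matvec`) are assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS, TOP_K = 8, 2   # values stated in the card
D_MODEL = 4                 # toy width, assumed for illustration


def matvec(w, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts and softmax-normalise their logits into gates."""
    order = sorted(range(len(router_logits)), key=router_logits.__getitem__, reverse=True)
    top = order[:top_k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


def moe_token(x, router_w, expert_ws, shared_w):
    """Output for one token: shared dense path plus the gated top_k experts."""
    chosen, gates = route(matvec(router_w, x))
    out = matvec(shared_w, x)                    # always-on shared path
    for idx, gate in zip(chosen, gates):
        expert_out = matvec(expert_ws[idx], x)   # only the chosen experts run
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out, chosen, gates


rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random Gaussian weights, a stand-in for trained parameters."""
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


router_w = rand_matrix(NUM_EXPERTS, D_MODEL)
expert_ws = [rand_matrix(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
shared_w = rand_matrix(D_MODEL, D_MODEL)

token = [rng.gauss(0, 1) for _ in range(D_MODEL)]
output, chosen, gates = moe_token(token, router_w, expert_ws, shared_w)
print(chosen, [round(g, 3) for g in gates])
```

Because only 2 of the 8 expert blocks run for a given token, the per-token compute follows the active parameter count rather than the full backbone size.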
The name follows the Gemma 4 convention (google/gemma-4-26B-A4B-it): the first number is the text backbone parameter count, A{X}B is the per-token active parameter count, and the vision encoder (0.57 B) is reported separately.
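As a small illustration of this naming convention, the sketch below pulls the two sizes out of a model ID. The helper name `parse_moe_name` and the regular expression are hypothetical; they only encode the `{N}B-A{X}B` pattern described above.

```python
import re

# Matches "<total>B-A<active>B", e.g. "52B-A36B" or "26B-A4B".
NAME_PATTERN = re.compile(r"(?P<total>\d+(?:\.\d+)?)B-A(?P<active>\d+(?:\.\d+)?)B")


def parse_moe_name(model_id):
    """Return (text backbone params, active params per token), both in billions."""
    match = NAME_PATTERN.search(model_id)
    if match is None:
        raise ValueError(f"no '<N>B-A<X>B' size tag in {model_id!r}")
    return float(match["total"]), float(match["active"])


print(parse_moe_name("JDONE-Research/AIOne-Agent-52B-A36B-it"))  # (52.0, 36.0)
print(parse_moe_name("google/gemma-4-26B-A4B-it"))               # (26.0, 4.0)
```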
Key Capabilities
- Korean reasoning and instruction following.
- Image understanding (caption, VQA, document understanding).
- Video understanding (frame-by-frame reasoning).
- Long-context document QA in Korean.
- Bilingual: Korean (primary) + English.
Quick Start
Transformers
import torch
from transformers import AutoProcessor, Gemma4ForConditionalGeneration

MODEL_ID = "JDONE-Research/AIOne-Agent-52B-A36B-it"

# Load the weights in bfloat16 and spread them across the available devices.
model = Gemma4ForConditionalGeneration.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Prompt (Korean): "What do you see in this photo? Please answer in Korean."
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image.jpg"},
            {"type": "text", "text": "이 사진에 무엇이 보이나요? 한국어로 답해주세요."},
        ],
    },
]

# Apply the chat template and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# Greedy decoding; slice off the prompt tokens so only the reply is printed.
generated = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(
    processor.tokenizer.decode(
        generated[0, inputs.input_ids.shape[1]:], skip_special_tokens=True
    )
)
Text-only
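# Prompt (Korean): "3 apples and 5 pears cost 12,000 won in total. If one
# apple costs 1,500 won, how much is one pear? Please solve it step by step."
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "사과 3개와 배 5개의 가격이 12,000원입니다. "
                "사과 1개가 1,500원이라면 배 1개 가격은? 단계적으로 풀이해주세요.",
            },
        ],
    },
]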
vLLM (recommended for serving)
vllm serve JDONE-Research/AIOne-Agent-52B-A36B-it \
--dtype bfloat16 \
--tensor-parallel-size 4 \
--max-model-len 32768
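Once the server is up, it can be queried over vLLM's OpenAI-compatible chat completions API. The following is a minimal standard-library sketch, assuming the server listens on vLLM's default `localhost:8000` (the command above does not set a port). The helper names `build_payload` and `chat` and the `max_tokens` default are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "JDONE-Research/AIOne-Agent-52B-A36B-it"
BASE_URL = "http://localhost:8000/v1"   # assumed: vLLM's default host and port


def build_payload(prompt, image_url=None, max_tokens=256):
    """Build a chat-completions request body; the image is optional."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
    }


def chat(prompt, image_url=None):
    """POST the request to the running server and return the reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("이 모델을 한 문장으로 소개해 주세요.")` should return the assistant's reply as a string; passing `image_url=...` adds an image to the same user turn.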
Sample Output
Korean math reasoning (text-only)
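단계별 풀이 과정은 다음과 같습니다.
1단계: 사과 3개의 전체 가격 구하기
사과 1개의 가격이 1,500원이므로, 3개의 가격을 계산합니다.
- 1,500원 × 3개 = 4,500원
2단계: 배 5개의 전체 가격 구하기
전체 금액(12,000원)에서 사과 3개의 가격(4,500원)을 빼면 배 5개의 전체 가격이 나옵니다.
(English translation: The step-by-step solution is as follows. Step 1: Find the total price of 3 apples. One apple costs 1,500 won, so compute the price of 3: 1,500 won × 3 = 4,500 won. Step 2: Find the total price of 5 pears. Subtracting the price of 3 apples (4,500 won) from the total (12,000 won) gives the total price of 5 pears.)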
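The excerpt above stops after step 2. As a cross-check on the arithmetic, the short sketch below recomputes the two steps shown and then carries out the final division implied by the problem statement. That last step is not part of the quoted output, and the variable names are illustrative.

```python
total_won = 12_000
apple_price, apple_count = 1_500, 3
pear_count = 5

apples_total = apple_price * apple_count   # step 1: 3 apples
pears_total = total_won - apples_total     # step 2: what is left for 5 pears
pear_price = pears_total / pear_count      # final step, not shown in the excerpt

print(apples_total, pears_total, pear_price)  # 4500 7500 1500.0
```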
Multimodal (image + Korean caption)
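다양한 색상의 점들이 섞여 무지개 빛깔의 그라데이션을 이루고 있는 이미지입니다.
(English translation: An image in which dots of many colors blend to form a rainbow-colored gradient.)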
Model Specs
| Field | Value |
|---|---|
| Architecture | Gemma4ForConditionalGeneration |
| Base model | google/gemma-4-31B-it |
| Text backbone parameters | 51.51 B → 52 B (in name) |
| Active parameters per token (text) | 35.90 B → A36B (in name) (dense MLP always on + top-2 of 8 experts + attention) |
| Vision tower | 0.57 B (SigLIP-style, 27 layers) |
| MM projector | 0.01 B |
| Total weights on disk | 52.09 B / ~104 GB (BF16) |
| MoE config | num_experts=8, top_k=2, moe_intermediate_size=2688 |
| Modality | Text + Image + Video → Text |
| Precision | bfloat16 |
| Context length | 256K |
| Languages | Korean (primary), English |
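The parameter and size figures in the table can be cross-checked with a few lines of arithmetic. The sketch below takes its inputs from the table and from the tensor-parallel size in the vLLM command; the per-GPU number is a rough weights-only estimate that assumes an even split and ignores KV cache and activation memory.

```python
text_backbone_b = 51.51      # billions of parameters, from the table
vision_tower_b = 0.57
projector_b = 0.01
active_text_b = 35.90
BYTES_PER_PARAM_BF16 = 2
TENSOR_PARALLEL_SIZE = 4     # from the vLLM serve command

total_b = text_backbone_b + vision_tower_b + projector_b
weights_gb = total_b * BYTES_PER_PARAM_BF16          # 1e9 params x 2 bytes = GB
per_gpu_gb = weights_gb / TENSOR_PARALLEL_SIZE       # weights only, even split assumed
active_fraction = active_text_b / text_backbone_b    # share of the text backbone used per token

print(
    f"{total_b:.2f} B params, {weights_gb:.1f} GB total, "
    f"{per_gpu_gb:.1f} GB per GPU, {active_fraction:.1%} active"
)
```

The result agrees with the 52.09 B and ~104 GB entries above, and shows that roughly 70% of the text backbone is active for each token.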
Intended Use
- Korean enterprise agent backend (long-context tool use, RAG, multi-turn reasoning).
- Image and video understanding with Korean output.
- Document QA in Korean.
Out-of-Scope Use
- Sole-source decision-making with legal consequences.
- Automated use of force or coercive control based purely on this model's output.
- Any media analysis that infringes on personal privacy, image rights, or applicable data-protection laws.
License
This model is released under the Apache License 2.0.
- Commercial use, redistribution, and modification are permitted with attribution.
- Provided "as is" without warranties or conditions of any kind.
Citation
@misc{aione_agent_52b_a36b_it,
title = {AIOne-Agent-52B-A36B-it: A Korean Sparse-MoE Multimodal Model},
author = {JDONE Research},
year = {2026},
howpublished = {\url{https://huggingface.co/JDONE-Research/AIOne-Agent-52B-A36B-it}}
}