Gemma3 MoE

This repository provides a Mixture-of-Experts (MoE) variant of Gemma-3, where the standard FFN layers are replaced with a top-k gated MoE architecture.

โš ๏ธ This model requires trust_remote_code=True because it uses a custom configuration and model implementation (gemma3_moe).

This model is intended for research and experimentation, with a focus on:

  • Korean reasoning and reading comprehension
  • Fact-checking and factual consistency judgment
  • Multilingual instruction-following
  • Analysis of MoE routing and expert specialization

Model Overview

  • Base model: Gemma-3
  • Architecture: Decoder-only Transformer with MoE FFN
  • Experts per layer: Configurable (e.g. 8 experts)
  • Routing: Top-k soft routing
  • Auxiliary loss: Router load-balancing loss
  • Framework: PyTorch
  • Model format: safetensors

Architecture Details

  • Each Transformer FFN block is replaced with an MoE layer
  • A lightweight router maps hidden states to expert logits
  • Top-k experts are selected per token
  • Expert outputs are merged via weighted summation
  • Router logits are retained for auxiliary balancing loss computation

This design enables implicit expert specialization across:

  • Logical and step-by-step reasoning
  • Factual QA and verification
  • Creative and conversational responses
  • Code-related patterns
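
The routing mechanism described above can be sketched in PyTorch. This is an illustrative reimplementation, not the repository's actual gemma3_moe code; the class name, argument names, and expert FFN shape are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k gated MoE FFN (hypothetical names, not the repo's code)."""

    def __init__(self, hidden, inter, n_experts=8, k=2):
        super().__init__()
        # Lightweight router: hidden states -> expert logits
        self.router = nn.Linear(hidden, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, inter), nn.GELU(), nn.Linear(inter, hidden))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                       # x: (tokens, hidden)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = torch.topk(F.softmax(logits, dim=-1), self.k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)   # renormalize over top-k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            tok, slot = (idx == e).nonzero(as_tuple=True)   # tokens routed to expert e
            if tok.numel():
                # Weighted summation of expert outputs
                out[tok] += weights[tok, slot].unsqueeze(-1) * expert(x[tok])
        # Router logits are returned so the trainer can add the balancing loss
        return out, logits
```

The per-expert loop is written for clarity; production implementations batch the dispatch, but the routing math is the same.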

Training Data

The model was fine-tuned using a streaming-based mixed SFT dataset, explicitly balanced across languages and task types.

Language Distribution

  • Korean: ~55%
  • English: ~35%
  • Code: ~10%

Datasets were interleaved with fixed probabilities and shuffled using a streaming buffer.


English Instruction Data (35%)

  • Open-Orca / OpenOrca
    • Subset: ~50K samples
    • General instruction-following
    • Multi-step reasoning and explanation tasks

Korean Instruction Data (55%)

Korean data was diversified with a strong emphasis on reasoning accuracy and factual correctness.

| Category | Dataset | Purpose | Weight |
|---|---|---|---|
| Logic / Reasoning | kyujinpy/KOpen-platypus | Step-by-step reasoning | 20% |
| QA | kikikara/ko_QA_dataset | General QA | 20% |
| MRC | Custom iterable | Reading comprehension | 25% |
| Fact-checking | Custom iterable | Factual verification / hallucination reduction | 20% |
| Creative | beomi/KoAlpaca-v1.1a | Conversational & creative tasks | 15% |

📌 Fact-checking data was intentionally up-weighted to strengthen factual grounding.


Code Instruction Data (10%)

  • sahil2801 / CodeAlpaca-20k
    • Subset: ~15K samples
    • Code generation and explanation tasks

Data Processing

  • Unified SFT formatting across all datasets
  • Streaming mode to reduce memory usage
  • Final shuffle buffer size: 10,000
  • Fixed-probability interleaving to preserve domain balance
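
The fixed-probability interleaving and streaming shuffle buffer above can be illustrated with plain Python generators. This is a self-contained sketch of the idea only; the actual pipeline used Hugging Face streaming datasets, and these helper names are hypothetical.

```python
import random

def interleave(streams, probs, seed=0):
    """Yield the next sample from a stream chosen with a fixed probability."""
    rng = random.Random(seed)
    streams = [iter(s) for s in streams]
    probs = list(probs)
    while streams:
        i = rng.choices(range(len(streams)), weights=probs, k=1)[0]
        try:
            yield next(streams[i])
        except StopIteration:
            # Drop exhausted streams and keep sampling from the rest.
            del streams[i], probs[i]

def buffer_shuffle(stream, buffer_size=10_000, seed=0):
    """Streaming shuffle with a fixed-size buffer (the card's final buffer: 10,000)."""
    rng = random.Random(seed)
    buf = []
    for sample in stream:
        buf.append(sample)
        if len(buf) >= buffer_size:
            # Emit a random buffered element for each new arrival.
            j = rng.randrange(len(buf))
            buf[j], buf[-1] = buf[-1], buf[j]
            yield buf.pop()
    rng.shuffle(buf)
    yield from buf

# Toy mixture weighted like the language distribution above (55/35/10)
mixed = buffer_shuffle(
    interleave([["ko"] * 6, ["en"] * 3, ["code"]], [0.55, 0.35, 0.10]),
    buffer_size=4,
)
```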

Training Strategy

This model was trained with parameter-group-specific optimization, explicitly separating shared parameters, expert FFNs, and router parameters.

Different learning rates and regularization settings were applied to encourage stable MoE specialization and balanced routing behavior.
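
For reference, a Switch-Transformer-style load-balancing term over the retained router logits might look like the following. This is an illustrative sketch; the repository's exact auxiliary loss may differ.

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits, k=2):
    """Encourage uniform expert usage; equals ~1.0 when routing is perfectly balanced."""
    probs = F.softmax(router_logits, dim=-1)        # (tokens, n_experts)
    n_experts = probs.shape[-1]
    # Fraction of top-k assignments that land on each expert
    topk_idx = probs.topk(k, dim=-1).indices.reshape(-1)
    counts = torch.zeros(n_experts).scatter_add_(
        0, topk_idx, torch.ones_like(topk_idx, dtype=torch.float))
    frac_tokens = counts / topk_idx.numel()
    # Mean router probability assigned to each expert
    frac_probs = probs.mean(dim=0)
    # Scaled dot product: minimized when both distributions are uniform
    return n_experts * torch.dot(frac_tokens, frac_probs)
```

The loss is added to the language-modeling loss with a small coefficient so that the router spreads tokens across experts without dominating training.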

Optimizer Parameter Groups

def get_moe_param_groups(model):
    """Split trainable parameters into shared / expert / router groups."""
    shared_params = []
    expert_params = []
    router_params = []

    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue

        # Router parameters
        if "mlp.router" in name:
            router_params.append(param)

        # Expert FFN parameters
        elif "mlp.experts" in name:
            expert_params.append(param)

        # Shared parameters: attention, embeddings, layer norms, etc.
        else:
            shared_params.append(param)

    return [
        {"params": shared_params, "lr": 2e-6, "weight_decay": 0.0},
        {"params": expert_params, "lr": 1e-5, "weight_decay": 0.1},
        {"params": router_params, "lr": 2e-5, "weight_decay": 0.0},
    ]
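
These groups plug straight into a standard optimizer. A compact, self-contained illustration with a toy module whose parameter names mirror the "mlp.router" / "mlp.experts" patterns matched above (the module layout and optimizer betas here are hypothetical, not the real model's):

```python
import torch
import torch.nn as nn

# Toy block standing in for one Transformer layer of the MoE model
block = nn.Module()
block.self_attn = nn.Linear(8, 8)                 # shared parameters
block.mlp = nn.Module()
block.mlp.router = nn.Linear(8, 4, bias=False)    # router parameters
block.mlp.experts = nn.ModuleList(nn.Linear(8, 8) for _ in range(4))  # expert FFNs

# Same grouping logic as get_moe_param_groups, inlined for a runnable example
shared, experts, router = [], [], []
for name, p in block.named_parameters():
    (router if "mlp.router" in name
     else experts if "mlp.experts" in name
     else shared).append(p)

optimizer = torch.optim.AdamW([
    {"params": shared,  "lr": 2e-6, "weight_decay": 0.0},
    {"params": experts, "lr": 1e-5, "weight_decay": 0.1},
    {"params": router,  "lr": 2e-5, "weight_decay": 0.0},
])
```

Keeping the router's weight decay at zero avoids shrinking its logits toward uniform routing, while the higher expert learning rate speeds up specialization.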

Fact-check Test

User Prompt

๋‹ค์Œ ์š”์•ฝ๋ฌธ์ด ์›๋ฌธ์— ๋น„์ถ”์–ด ์‚ฌ์‹ค์ธ์ง€ ํŒ๋‹จํ•˜์‹œ์˜ค. ์‚ฌ์‹ค ์—ฌ๋ถ€์™€ ํŒ๋‹จ ๊ทผ๊ฑฐ๋ฅผ ํ•จ๊ป˜ ์ œ์‹œํ•˜์‹œ์˜ค.

[Original] Starting this year, registration of franchise headquarters' information disclosure documents is expected to be shortened from several months to within 30 days. The Seoul Metropolitan Government announced on the 16th that, as of January 1 this year, it has taken over disclosure-document registration from the Fair Trade Commission, with three local governments (Seoul, Incheon, and Gyeonggi) now handling the work. The information disclosure document lets prospective franchisees review a franchise headquarters' information before signing a contract and plays a major role in the decision to sign. Previously, registration was possible only through the Fair Trade Commission and could take as long as several months, but with this year's division of work with local governments the registration period is shortened to within 30 days. The city said that because the document contains material such as financial details and investment returns from which the headquarters' condition can be assessed, it will screen and register it more carefully and systematically so that prospective franchisees can make rational, prudent startup decisions. Seoul-based franchise headquarters wishing to register a disclosure document can apply by mail or in person to the Franchise Information Team of the city's Fair Economy Division (8th floor, Mugyo Annex, Seoul City Hall, 21 Mugyo-ro, Jung-gu) or through the Fair Trade Commission's franchise business transaction website. The city will also hold a briefing session on disclosure-document registration for franchise headquarters at 2 p.m. on the 18th in the multipurpose hall of Seoul City Hall.

[Summary] The Seoul Metropolitan Government announced on the 16th that, as of January 1 this year, it has taken over franchise disclosure-document registration from the Fair Trade Commission, with three local governments (Seoul, Incheon, and Gyeonggi) handling the work; the information disclosure document lets prospective franchisees review a franchise headquarters' information before signing a contract and plays a major role in the decision to sign.

Test Result (Model Output)

ํŒ๋‹จ: ์‚ฌ์‹ค

์˜ค๋ฅ˜ ์œ ํ˜•: ํ•œ๊ธ€ ๋งž์ถค๋ฒ•, ๋„์–ด์“ฐ๊ธฐ ์˜ค๋ฅ˜

๊ทผ๊ฑฐ: ์š”์•ฝ๋ฌธ์—๋Š” '1์›”1์ผ๋ถ€ํ„ฐ'๋ผ๋Š” ํ‘œํ˜„์ด ์žˆ์œผ๋ฉฐ, ์ด๋Š” '1์›” 1์ผ๋ถ€ํ„ฐ'๋กœ ์ˆ˜์ •๋˜์–ด์•ผ ํ•œ๋‹ค. ํ•ด๋‹น ์˜ค๋ฅ˜๋Š” [1] ํ•œ๊ธ€ ๋งž์ถค๋ฒ•, ๋„์–ด์“ฐ๊ธฐ ์˜ค๋ฅ˜์— ํ•ด๋‹นํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ์˜ค๋ฅ˜๋Š” ์˜๋ฏธ๋‚˜ ์‚ฌ์‹ค ๊ด€๊ณ„์— ์˜ํ–ฅ์„ ์ฃผ์ง€ ์•Š๋Š”๋‹ค.

QA Test

User Prompt

What are the recovery function and the multi-volume compression function of the ARJ compression program?

Test Result (Model Output)

The ARJ compression program provides a recovery function and a multi-volume compression function. The recovery function restores damaged archive files to their original form, while the multi-volume compression function splits an archive into several smaller volumes and can apply a different compression algorithm to each volume to improve the compression ratio. Personally, I think these features are very useful because they enable recovery of damaged archives and efficient compression.

MRC Test

User Prompt

[News article] The National Council of Youth Organizations in Korea and the Ministry of Gender Equality and Family announced on the 21st that they will hold the 'International Youth Forum' in Seoul and Goesan, Chungbuk, from the 22nd to the 28th. Some 75 university students and youths from 32 countries worldwide, including Korea, the United States, Canada, and Australia, will gather to seek alternatives and solutions to pressing global issues. This forum's theme is 'Youth and New Media': participants will discuss reflections on, and problems with, 'new media' — the new communication channels such as smartphones, SNS, and tablet PCs. Starting with a keynote lecture, the program includes country-by-country case presentations on the theme, group discussions and plenary sessions, and the drafting and adoption of a 'Youth Declaration'. The opening ceremony takes place on the 22nd in the international conference hall of the International Youth Center in Banghwa-dong, Seoul. Around 100 people are expected to attend, including the participants from 32 countries, heads of youth organizations and youth leaders from across the country, the Ministry of Gender Equality and Family, and foreign diplomatic missions in Korea. On the 23rd, Park Young-sook, head of the UN Future Forum, gives a keynote lecture on 'the role of youth in the balanced development of new media', covering sound uses of new media and the formation of youth culture. At the closing ceremony on the 27th, the 'Youth Declaration' will be adopted; it will call for the sound development of new media worldwide and set out actions that youths in each country can practice together. After the forum, the National Council of Youth Organizations in Korea plans to deliver the declaration to the international community, including international bodies such as the UN and the governments of the participating countries.

[์งˆ์˜] ์„œ์šธ๊ณผ ์ถฉ๋ถ ๊ดด์‚ฐ์—์„œ '๊ตญ์ œ์ฒญ์†Œ๋…„ํฌ๋Ÿผ'์„ ์—ฌ๋Š” ๊ณณ์€?

Test Result (Model Output)

The National Council of Youth Organizations in Korea and the Ministry of Gender Equality and Family

Code Test

User Prompt

Create an array of length 5 which contains all even numbers between 1 and 10.

Test Result (Model Output)

arr = [2, 4, 6, 8, 10]


Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "chyungwon/police-report-analysis-model-MoE-3X8",
    trust_remote_code=True,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(
    "chyungwon/police-report-analysis-model-MoE-3X8",
    trust_remote_code=True,
)

inputs = tokenizer("MoE(Mixture-of-Experts) 모델이 머야?", return_tensors="pt").to(model.device)  # "What is an MoE (Mixture-of-Experts) model?"
outputs = model.generate(**inputs, max_new_tokens=64)

print('out : ', tokenizer.decode(outputs[0], skip_special_tokens=True))

out : Personally, I think MoE (Mixture-of-Experts) models are very effective at solving complex problems. However, their drawbacks, such as model complexity and the large amount of training data required, also need to be considered.

Model Card Authors

(์ฃผ)์ธ์ •๋ณด
ํ™ˆํŽ˜์ด์ง€ : http://www.ijbinfo.com

์ •๋ณดํ†ต์‹ ์‚ฐ์—…์ง„ํฅ์›์˜ ์ง€์›์„ ๋ฐ›์•„์„œ ์ง„ํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค.

Model Card Contact

(์ฃผ)์ธ์ •๋ณด
์ฃผ์†Œ : ์„œ์šธ์‹œ ๊ธˆ์ฒœ๊ตฌ ๊ฐ€์‚ฐ๋™ 60-5 ๊ฐ‘์„๊ทธ๋ ˆ์ดํŠธ๋ฐธ๋ฆฌA๋™ 805ํ˜ธ
์—ฐ๋ฝ์ฒ˜ : TEL : 02-3397-7765 FAX : 02-3397-7769 E-mail : sales@injungbo.co.kr
๋‹ด๋‹น์ž : ์žฅํ˜•์›(chyungwon@ijbinfo.com)
Model size: 5B parameters (F32, safetensors)