L1


L1 (Learning Unit 1) is the first language model from Lunit and the Lunit Consortium, purpose-built for the medical domain. Derived from Gravity-16B-A3B-Base, L1 is designed for clinical reasoning and decision support.

✨ Key Highlights

  • 🩺 Medical-Domain Specialized: Developed specifically for clinical reasoning and medical decision support
  • ⚡ Efficient MoE: Only 3B of 16.24B total parameters are active per token, giving fast inference with high capacity
  • 💭 Thinking Model: Performs step-by-step reasoning in <think> tags before generating the final answer

Note: L1 reasons internally using <think>...</think> blocks before producing a response. This chain-of-thought process improves answer quality but consumes additional tokens. Set max_tokens accordingly (recommended: 2048+).
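Because the reasoning is wrapped in <think>...</think> tags, downstream code typically separates it from the visible answer. A minimal sketch (the helper name and regex are illustrative, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Separate <think>...</think> reasoning from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No thinking block found; treat everything as the answer.
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

raw = "<think>Recall Sepsis-3: SOFA increase of 2 or more...</think>Sepsis is defined as..."
thinking, answer = split_thinking(raw)
print(answer)  # -> Sepsis is defined as...
```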

📋 Model Specifications

  • Type: Causal Language Model
  • Base Model: Gravity-16B-A3B-Base from Trillion Labs and Lunit Consortium
  • Architecture: GravityMoE (Sparse Mixture-of-Experts with MLA)
  • Total Parameters: 16.24B
  • Active Parameters: 3B
  • Number of Layers: 28
  • Attention Heads: 16
  • KV Heads: 16
  • Hidden Size: 2048
  • MoE Intermediate Size: 1408
  • Routed Experts: 64 (top-8 selection)
  • Shared Experts: 1
  • Context Length: 32,768 tokens
  • Vocabulary Size: 151,552
  • Tokenizer: GLM-4.5
  • Precision: bf16
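The active/total split follows directly from the MoE layout above. A back-of-envelope check, assuming SwiGLU-style experts (gate, up, and down projections) and ignoring attention, MLA, and norm weights, so the figures are approximate rather than exact:

```python
# Rough parameter count from the published specs.
# Assumes SwiGLU experts; omits attention/MLA and norm weights.
layers = 28
hidden = 2048
moe_inter = 1408
routed, top_k, shared = 64, 8, 1
vocab = 151552

expert_params = 3 * hidden * moe_inter              # ~8.65M per expert
total_experts = layers * (routed + shared) * expert_params
embeddings = 2 * vocab * hidden                     # input + output embeddings

total = total_experts + embeddings                  # ~16.4B (published: 16.24B)
active = layers * (top_k + shared) * expert_params + embeddings  # ~2.8B (published: ~3B)
print(f"total ~ {total / 1e9:.1f}B, active ~ {active / 1e9:.1f}B")
```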

🚀 Quickstart

SGLang (Recommended)

Install:

pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"

Launch server:

python -m sglang.launch_server \
  --model-path learning-unit/L1-16B-A3B \
  --port 9006 --host 0.0.0.0 \
  --tp 1 --dtype bfloat16 --trust-remote-code \
  --attention-backend triton \
  --moe-runner-backend triton

Query:

curl -X POST http://localhost:9006/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "learning-unit/L1-16B-A3B",
    "messages": [
      {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
    ],
    "max_tokens": 2048
  }'
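The same request can be issued from Python against the OpenAI-compatible endpoint shown above. A minimal stdlib sketch (`build_chat_request` is an illustrative helper, not part of SGLang; the send step assumes the server from the launch command is running):

```python
import json
import urllib.request

def build_chat_request(prompt: str, max_tokens: int = 2048) -> dict:
    """Assemble a chat-completions payload for the local SGLang server."""
    return {
        "model": "learning-unit/L1-16B-A3B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("What are the diagnostic criteria for sepsis?")
body = json.dumps(payload).encode("utf-8")

# Uncomment once the server from the launch command above is up:
# req = urllib.request.Request(
#     "http://localhost:9006/v1/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```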

Transformers

Install:

pip install "transformers>=5.0" torch

Run:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "learning-unit/L1-16B-A3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

messages = [
    {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    temperature=0.7,
    do_sample=True,
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

💬 Examples

L1 is specialized for the medical domain and covers a wide range of clinical scenarios. Below are representative examples from real-world clinical use cases.

Medical Q&A

A 45-year-old woman with lupus nephritis on mycophenolate and prednisone develops fever, dry cough, and bilateral ground-glass opacities on chest CT. Her CD4 count is 180. What is your differential diagnosis and recommended workup?

Patient Education

I have diabetes and use insulin daily. What is the proper way to store insulin at home?

Clinical Documentation

Please draft an overnight progress note. Patient labs: RBC 4.5, WBC 8. Vitals: HR 82, BP 118/76, RR 15, Temp 37.1. Nurse reports stable overnight. Plan: continue antibiotics, recheck labs in the morning.

Emergency Triage

λ‹€μŒ 응급싀 ν™˜μžμ— λŒ€ν•΄ KTAS triageλ₯Ό μˆ˜ν–‰ν•˜κ³ , 초기 진단 및 감별진단을 μ œμ‹œν•΄μ£Όμ„Έμš”. 78μ„Έ μ—¬μ„± ν™˜μžκ°€ 119 κ΅¬κΈ‰μ°¨λ‘œ 응급싀에 λ‚΄μ›ν–ˆμŠ΅λ‹ˆλ‹€. 22μ‹œκ²½ κ°‘μžκΈ° 쒌츑 μ•ˆλ©΄μ΄ μ²˜μ§€κ³  말이 μ–΄λˆŒν•΄μ§€λŠ” 증상이 λ°œμƒν–ˆμŠ΅λ‹ˆλ‹€. 두톡을 ν˜Έμ†Œν•˜λ©°, κ³ ν˜ˆμ•• 병λ ₯이 μžˆμŠ΅λ‹ˆλ‹€. ν™œλ ₯μ§•ν›„λŠ” ν˜ˆμ•• 172/88, μ‹¬λ°•μˆ˜ 92, 호흑수 14, 체온 36.8, μ‚°μ†Œν¬ν™”λ„ 98%이고 μ˜μ‹μ€ λͺ…λ£Œν•©λ‹ˆλ‹€. 사지 μœ„μ•½κ°μ€ μ—†μŠ΅λ‹ˆλ‹€.

Adverse Drug Reaction (ADR) Causality Assessment

λ‹€μŒ ν™˜μžμ˜ μ•½λ¬Όμ΄μƒλ°˜μ‘(ADR)에 λŒ€ν•΄ WHO-UMC κΈ°μ€€μœΌλ‘œ 인과관계λ₯Ό ν‰κ°€ν•΄μ£Όμ„Έμš”. 80μ„Έ μ—¬μ„± ν™˜μžκ°€ κΈ°κ΄€μ§€ν™•μž₯증으둜 μž…μ› 쀑 moxifloxacin 400mg IVλ₯Ό νˆ¬μ—¬λ°›μ•˜μŠ΅λ‹ˆλ‹€. νˆ¬μ—¬ 쀑 μ „μ‹  ν”ΌλΆ€ 가렀움이 μƒˆλ‘œ λ°œμƒν–ˆκ³ , μ•½λ¬Ό 쀑단 ν›„ ν™˜μž 본인도 가렀움이 μ€„μ–΄λ“œλŠ” 양상을 ν‘œν˜„ν–ˆμœΌλ©° 이후 νšŒλ³΅λ˜μ—ˆμŠ΅λ‹ˆλ‹€. μž¬νˆ¬μ—¬λŠ” μ‹œν–‰ν•˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€. κΈ°μ‘΄ μ•½λ¬Ό μ•Œλ ˆλ₯΄κΈ°λ ₯은 μ—†κ³ , 가렀움을 μœ λ°œν•  λ§Œν•œ λ‹€λ₯Έ λ³‘μš©μ•½λ¬Όμ΄λ‚˜ ν”ΌλΆ€μ§ˆν™˜μ€ ν™•μΈλ˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€.

📊 Benchmark

All benchmarks were evaluated using CoEval, Lunit's open-source medical LLM evaluation framework. Evaluations use greedy decoding (temperature=0). To reproduce these results:

git clone https://github.com/lunit-io/CoEval.git
cd CoEval

Refer to the CoEval Quickstart for setup and evaluation instructions.

MCQA Benchmarks

| Model | PubMedQA | AttrBench | MedQA | CareQA | HeadQA | MedMCQA | MMLU-Pro (Health) | M-ARC | MetaMedQA | MedHallu | MedCalc | MedBullets 4-opt | MedBullets 5-opt | MedXpertQA-R | MedXpertQA-U | W.Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B | 78.00 | 76.10 | 91.10 | 91.00 | 88.40 | 74.80 | 74.60 | 40.00 | 76.50 | 83.50 | 30.30 | 84.70 | 82.10 | 35.60 | 32.90 | 79.43 |
| GPT-OSS-20B | 75.80 | 74.80 | 83.90 | 84.80 | 83.30 | 65.40 | 70.50 | 31.00 | 70.10 | 81.30 | 29.20 | 73.40 | 70.50 | 24.70 | 21.20 | 73.38 |
| Qwen3.5-122B | 76.40 | 55.68 | 87.80 | 86.40 | 84.00 | 74.40 | 73.00 | 59.00 | 73.90 | 37.50 | 53.70 | 79.20 | 79.50 | 35.90 | 35.30 | 75.08 |
| MedGemma-27B | 73.40 | 74.80 | 84.40 | 85.00 | 83.80 | 71.90 | 73.00 | 48.00 | 69.60 | 81.40 | 24.10 | 73.70 | 68.80 | 19.10 | 20.50 | 73.99 |
| Gemma4-26B-A4B | 76.40 | 72.00 | 81.80 | 84.50 | 82.30 | 67.30 | 73.50 | 67.00 | 71.50 | 86.50 | 45.60 | 73.70 | 67.50 | 45.10 | 39.20 | 75.34 |
| L1-16B-A3B | 84.20 | 78.40 | 85.50 | 88.20 | 85.80 | 76.70 | 74.90 | 82.00 | 73.10 | 76.10 | 43.90 | 78.90 | 70.80 | 27.50 | 29.20 | 77.74 |

Chat Task

| Model | HealthBench-Consensus |
|---|---|
| GPT-OSS-120B | 90.60 |
| GPT-OSS-20B | 78.70 |
| Qwen3.5-122B | 92.20 |
| MedGemma-27B | 90.70 |
| Gemma4-26B-A4B | 92.60 |
| L1-16B-A3B | 93.50 |

πŸ“ Citation

@misc{lunit2026l1,
  title={L1: The First Clinical Language Model by Lunit},
  author={Lunit},
  year={2026},
  url={https://huggingface.co/learning-unit/L1-16B-A3B}
}

⚠️ Limitations

  • Not a substitute for professional medical judgment. L1 may generate factually incorrect, incomplete, or outdated clinical information. All outputs should be verified by qualified healthcare professionals.
  • Thinking overhead. Chain-of-thought reasoning in <think> tags increases token consumption and latency compared to non-thinking models of similar size.
  • Context length. Maximum context length is 32,768 tokens.
  • No real-time knowledge. The model's knowledge is limited to its training data cutoff and does not reflect the latest medical guidelines or drug approvals.

🤝 Acknowledgements

L1 is a collaborative effort by the following consortium members:

Industry

  • Lunit
  • Trillion Labs
  • SK Biopharmaceuticals
  • Kakao Healthcare
  • AIGEN Sciences
  • D-Circle
  • Rebellions
  • Standigm

Academia

  • Prof. Choi Yun-jae's Lab from KAIST
  • Prof. Hong Seung-hoon's Lab from KAIST
  • Prof. Jung Yu-seong's Lab from SNU
  • Prof. Kim Hyun-woo's Lab from KAIST
  • Prof. Kim Tae-gyun's Lab from KAIST
  • Prof. Ye Jong-cheol's Lab from KAIST

Hospitals

  • NHIS Ilsan Hospital
  • Ewha Womans University Seoul Hospital
  • Keimyung University Dongsan Medical Center
  • Konyang University Hospital
  • Korea University Research & Business Foundation
  • Kyung Hee University Hospital at Gangdong
  • Kyung Hee University Medical Center
  • Pusan National University Yangsan Hospital
  • Yongin Severance Hospital


📄 License

This model is licensed under the Apache 2.0 License.

📬 Contact
