L1


L1 (Learning Unit 1) is the first language model from Lunit and the Lunit Consortium, purpose-built for the medical domain. Derived from Gravity-16B-A3B-Base, L1 is designed for clinical reasoning and decision support.

✨ Key Highlights

  • 🩺 Medical-Domain Specialized: Developed specifically for clinical reasoning and medical decision support
  • ⚡ Efficient MoE: Only 3B of 16.24B total parameters are active per token, giving fast inference with high capacity
  • 💭 Thinking Model: Performs step-by-step reasoning in <think> tags before generating the final answer

Note: L1 reasons internally using <think>...</think> blocks before producing a response. This chain-of-thought process improves answer quality but consumes additional tokens. Set max_tokens accordingly (recommended: 2048+).
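Because the reasoning is wrapped in <think>...</think> tags, downstream code typically separates it from the visible answer. A minimal sketch (the helper name and regex are illustrative, not part of the model's API):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Separate <think>...</think> reasoning from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No thinking block found; treat everything as the answer.
        return "", text.strip()
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

raw = "<think>Recall Sepsis-3: SOFA increase of 2 or more...</think>Sepsis is defined as..."
thinking, answer = split_thinking(raw)
print(answer)  # -> Sepsis is defined as...
```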

📋 Model Specifications

  • Type: Causal Language Model
  • Base Model: Gravity-16B-A3B-Base from Trillion Labs and Lunit Consortium
  • Architecture: GravityMoE (Sparse Mixture-of-Experts with MLA)
  • Total Parameters: 16.24B
  • Active Parameters: 3B
  • Number of Layers: 28
  • Attention Heads: 16
  • KV Heads: 16
  • Hidden Size: 2048
  • MoE Intermediate Size: 1408
  • Routed Experts: 64 (top-8 selection)
  • Shared Experts: 1
  • Context Length: 32,768 tokens
  • Vocabulary Size: 151,552
  • Tokenizer: GLM-4.5
  • Precision: bf16
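The active/total split follows directly from the MoE layout above. A back-of-envelope check, assuming SwiGLU-style experts (gate, up, and down projections) and ignoring attention, MLA, and norm weights, so the figures are approximate rather than exact:

```python
# Rough parameter count from the published specs.
# Assumes SwiGLU experts; omits attention/MLA and norm weights.
layers = 28
hidden = 2048
moe_inter = 1408
routed, top_k, shared = 64, 8, 1
vocab = 151552

expert_params = 3 * hidden * moe_inter              # ~8.65M per expert
total_experts = layers * (routed + shared) * expert_params
embeddings = 2 * vocab * hidden                     # input + output embeddings

total = total_experts + embeddings                  # ~16.4B (published: 16.24B)
active = layers * (top_k + shared) * expert_params + embeddings  # ~2.8B (published: ~3B)
print(f"total ~ {total / 1e9:.1f}B, active ~ {active / 1e9:.1f}B")
```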

🚀 Quickstart

SGLang (Recommended)

Install:

pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"

Launch server:

python -m sglang.launch_server \
  --model-path learning-unit/L1-16B-A3B \
  --port 9006 --host 0.0.0.0 \
  --tp 1 --dtype bfloat16 --trust-remote-code \
  --attention-backend triton \
  --moe-runner-backend triton

Query:

curl -X POST http://localhost:9006/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "learning-unit/L1-16B-A3B",
    "messages": [
      {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
    ],
    "max_tokens": 2048
  }'
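The same request can be issued from Python against the OpenAI-compatible endpoint shown above. A minimal stdlib sketch (`build_chat_request` is an illustrative helper, not part of SGLang; the send step assumes the server from the launch command is running):

```python
import json
import urllib.request

def build_chat_request(prompt: str, max_tokens: int = 2048) -> dict:
    """Assemble a chat-completions payload for the local SGLang server."""
    return {
        "model": "learning-unit/L1-16B-A3B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("What are the diagnostic criteria for sepsis?")
body = json.dumps(payload).encode("utf-8")

# Uncomment once the server from the launch command above is up:
# req = urllib.request.Request(
#     "http://localhost:9006/v1/chat/completions",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```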

Transformers

Install:

pip install "transformers>=5.0" torch

Run:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "learning-unit/L1-16B-A3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

messages = [
    {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    temperature=0.7,
    do_sample=True,
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

💬 Examples

L1 is specialized for the medical domain and covers a wide range of clinical scenarios. Below are representative examples from real-world clinical use cases.

Medical Q&A

A 45-year-old woman with lupus nephritis on mycophenolate and prednisone develops fever, dry cough, and bilateral ground-glass opacities on chest CT. Her CD4 count is 180. What is your differential diagnosis and recommended workup?

Patient Education

I have diabetes and use insulin daily. What is the proper way to store insulin at home?

Clinical Documentation

Please draft an overnight progress note. Patient labs: RBC 4.5, WBC 8. Vitals: HR 82, BP 118/76, RR 15, Temp 37.1. Nurse reports stable overnight. Plan: continue antibiotics, recheck labs in the morning.

Emergency Triage

λ‹€μŒ 응급싀 ν™˜μžμ— λŒ€ν•΄ KTAS triageλ₯Ό μˆ˜ν–‰ν•˜κ³ , 초기 진단 및 감별진단을 μ œμ‹œν•΄μ£Όμ„Έμš”. 78μ„Έ μ—¬μ„± ν™˜μžκ°€ 119 κ΅¬κΈ‰μ°¨λ‘œ 응급싀에 λ‚΄μ›ν–ˆμŠ΅λ‹ˆλ‹€. 22μ‹œκ²½ κ°‘μžκΈ° 쒌츑 μ•ˆλ©΄μ΄ μ²˜μ§€κ³  말이 μ–΄λˆŒν•΄μ§€λŠ” 증상이 λ°œμƒν–ˆμŠ΅λ‹ˆλ‹€. 두톡을 ν˜Έμ†Œν•˜λ©°, κ³ ν˜ˆμ•• 병λ ₯이 μžˆμŠ΅λ‹ˆλ‹€. ν™œλ ₯μ§•ν›„λŠ” ν˜ˆμ•• 172/88, μ‹¬λ°•μˆ˜ 92, 호흑수 14, 체온 36.8, μ‚°μ†Œν¬ν™”λ„ 98%이고 μ˜μ‹μ€ λͺ…λ£Œν•©λ‹ˆλ‹€. 사지 μœ„μ•½κ°μ€ μ—†μŠ΅λ‹ˆλ‹€.

Adverse Drug Reaction (ADR) Causality Assessment

λ‹€μŒ ν™˜μžμ˜ μ•½λ¬Όμ΄μƒλ°˜μ‘(ADR)에 λŒ€ν•΄ WHO-UMC κΈ°μ€€μœΌλ‘œ 인과관계λ₯Ό ν‰κ°€ν•΄μ£Όμ„Έμš”. 80μ„Έ μ—¬μ„± ν™˜μžκ°€ κΈ°κ΄€μ§€ν™•μž₯증으둜 μž…μ› 쀑 moxifloxacin 400mg IVλ₯Ό νˆ¬μ—¬λ°›μ•˜μŠ΅λ‹ˆλ‹€. νˆ¬μ—¬ 쀑 μ „μ‹  ν”ΌλΆ€ 가렀움이 μƒˆλ‘œ λ°œμƒν–ˆκ³ , μ•½λ¬Ό 쀑단 ν›„ ν™˜μž 본인도 가렀움이 μ€„μ–΄λ“œλŠ” 양상을 ν‘œν˜„ν–ˆμœΌλ©° 이후 νšŒλ³΅λ˜μ—ˆμŠ΅λ‹ˆλ‹€. μž¬νˆ¬μ—¬λŠ” μ‹œν–‰ν•˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€. κΈ°μ‘΄ μ•½λ¬Ό μ•Œλ ˆλ₯΄κΈ°λ ₯은 μ—†κ³ , 가렀움을 μœ λ°œν•  λ§Œν•œ λ‹€λ₯Έ λ³‘μš©μ•½λ¬Όμ΄λ‚˜ ν”ΌλΆ€μ§ˆν™˜μ€ ν™•μΈλ˜μ§€ μ•Šμ•˜μŠ΅λ‹ˆλ‹€.

📊 Benchmark

All benchmarks were evaluated using CoEval, Lunit's open-source medical LLM evaluation framework. Evaluations use greedy decoding (temperature=0). To reproduce these results:

git clone https://github.com/lunit-io/CoEval.git
cd CoEval

Refer to the CoEval Quickstart for setup and evaluation instructions.

MCQA Benchmarks

| Model | PubMedQA | AttrBench | MedQA | CareQA | HeadQA | MedMCQA | MMLU-Pro (Health) | M-ARC | MetaMedQA | MedHallu | MedCalc | MedBullets 4-opt | MedBullets 5-opt | MedXpertQA-R | MedXpertQA-U | W.Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B | 78.00 | 76.10 | 91.10 | 91.00 | 88.40 | 74.80 | 74.60 | 40.00 | 76.50 | 83.50 | 30.30 | 84.70 | 82.10 | 35.60 | 32.90 | 79.43 |
| GPT-OSS-20B | 75.80 | 74.80 | 83.90 | 84.80 | 83.30 | 65.40 | 70.50 | 31.00 | 70.10 | 81.30 | 29.20 | 73.40 | 70.50 | 24.70 | 21.20 | 73.38 |
| Qwen3.5-122B | 76.40 | 55.68 | 87.80 | 86.40 | 84.00 | 74.40 | 73.00 | 59.00 | 73.90 | 37.50 | 53.70 | 79.20 | 79.50 | 35.90 | 35.30 | 75.08 |
| MedGemma-27B | 73.40 | 74.80 | 84.40 | 85.00 | 83.80 | 71.90 | 73.00 | 48.00 | 69.60 | 81.40 | 24.10 | 73.70 | 68.80 | 19.10 | 20.50 | 73.99 |
| Gemma4-26B-A4B | 76.40 | 72.00 | 81.80 | 84.50 | 82.30 | 67.30 | 73.50 | 67.00 | 71.50 | 86.50 | 45.60 | 73.70 | 67.50 | 45.10 | 39.20 | 75.34 |
| L1-16B-A3B | 84.20 | 78.40 | 85.50 | 88.20 | 85.80 | 76.70 | 74.90 | 82.00 | 73.10 | 76.10 | 43.90 | 78.90 | 70.80 | 27.50 | 29.20 | 77.74 |

Chat Task

| Model | HealthBench-Consensus |
|---|---|
| GPT-OSS-120B | 90.60 |
| GPT-OSS-20B | 78.70 |
| Qwen3.5-122B | 92.20 |
| MedGemma-27B | 90.70 |
| Gemma4-26B-A4B | 92.60 |
| L1-16B-A3B | 93.50 |

πŸ“ Citation

@misc{lunit2026l1,
  title={L1: The First Clinical Language Model by Lunit},
  author={Lunit},
  year={2026},
  url={https://huggingface.co/learning-unit/L1-16B-A3B}
}

⚠️ Limitations

  • Not a substitute for professional medical judgment. L1 may generate factually incorrect, incomplete, or outdated clinical information. All outputs should be verified by qualified healthcare professionals.
  • Thinking overhead. Chain-of-thought reasoning in <think> tags increases token consumption and latency compared to non-thinking models of similar size.
  • Context length. Maximum context length is 32,768 tokens.
  • No real-time knowledge. The model's knowledge is limited to its training data cutoff and does not reflect the latest medical guidelines or drug approvals.

🤝 Acknowledgements

L1 is a collaborative effort by the following consortium members:

Industry

  • Lunit
  • Trillion Labs
  • SK Biopharmaceuticals
  • Kakao Healthcare
  • AIGEN Sciences
  • D-Circle
  • Rebellions
  • Standigm

Academia

  • Prof. Choi Yun-jae's Lab from KAIST
  • Prof. Hong Seung-hoon's Lab from KAIST
  • Prof. Jung Yu-seong's Lab from SNU
  • Prof. Kim Hyun-woo's Lab from KAIST
  • Prof. Kim Tae-gyun's Lab from KAIST
  • Prof. Ye Jong-cheol's Lab from KAIST

Hospitals

  • NHIS Ilsan Hospital
  • Ewha Womans University Seoul Hospital
  • Keimyung University Dongsan Medical Center
  • Konyang University Hospital
  • Korea University Research & Business Foundation
  • Kyung Hee University Hospital at Gangdong
  • Kyung Hee University Medical Center
  • Pusan National University Yangsan Hospital
  • Yongin Severance Hospital


📄 License

This model is licensed under the Apache 2.0 License.

📬 Contact
