---
license: apache-2.0
language:
- en
base_model:
- trillionlabs/Gravity-16B-A3B-Base
tags:
- medical
- clinical
- mixture-of-experts
- conversational
- sft
library_name: transformers
pipeline_tag: text-generation
---


# Learning Unit 1

**L1** (Learning Unit 1) is the first language model from [Lunit](https://www.lunit.io) and the Lunit Consortium, purpose-built for the medical domain. Derived from [Gravity-16B-A3B-Base](https://huggingface.co/trillionlabs/Gravity-16B-A3B-Base), L1 is designed for clinical reasoning and decision support.

### ✨ Key Highlights

* 🩺 **Medical-Domain Specialized**: Developed specifically for clinical reasoning and medical decision support
* ⚡ **Efficient MoE**: Only 3B of 16.24B total parameters are active per token, giving fast inference with high capacity
* 💭 **Thinking Model**: Performs step-by-step reasoning in `<think>` tags before generating the final answer

> **Note:** L1 reasons internally in `<think>...</think>` blocks before producing a response. This chain-of-thought process improves answer quality but consumes additional tokens. Set `max_tokens` accordingly (recommended: 2048+).

### 📋 Model Specifications

- Type: Causal Language Model
- Base Model: [Gravity-16B-A3B-Base](https://huggingface.co/trillionlabs/Gravity-16B-A3B-Base) from Trillion Labs and the Lunit Consortium
- Architecture: GravityMoE (Sparse Mixture-of-Experts with MLA)
- Total Parameters: 16.24B
- Active Parameters: 3B
- Number of Layers: 28
- Attention Heads: 16
- KV Heads: 16
- Hidden Size: 2048
- MoE Intermediate Size: 1408
- Routed Experts: 64 (top-8 selection)
- Shared Experts: 1
- Context Length: 32,768 tokens
- Vocabulary Size: 151,552
- Tokenizer: GLM-4.5
- Precision: bf16

## 🚀 Quickstart

### SGLang (Recommended)

**Install:**

```bash
pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"
```

**Launch server:**

```bash
python -m sglang.launch_server \
  --model-path learning-unit/L1-16B-A3B \
  --port 9006 --host 0.0.0.0 \
  --tp 1 --dtype bfloat16 --trust-remote-code \
  --attention-backend triton \
  --moe-runner-backend triton
```

**Query:**

```bash
curl -X POST http://localhost:9006/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "learning-unit/L1-16B-A3B",
    "messages": [
      {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
    ],
    "max_tokens": 2048
  }'
```

### Transformers

**Install:**

```bash
pip install "transformers>=5.0" torch
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "learning-unit/L1-16B-A3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

messages = [
    {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    temperature=0.7,
    do_sample=True,
)
# Strip the prompt tokens so only the newly generated text is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

## 💬 Examples

L1 is specialized for the medical domain and covers a wide range of clinical scenarios. Below are representative examples from real-world clinical use cases.

### Medical Q&A

> A 45-year-old woman with lupus nephritis on mycophenolate and prednisone develops fever, dry cough, and bilateral ground-glass opacities on chest CT. Her CD4 count is 180. What is your differential diagnosis and recommended workup?

### Patient Education

> I have diabetes and use insulin daily. What is the proper way to store insulin at home?

### Clinical Documentation

> Please draft an overnight progress note. Patient labs: RBC 4.5, WBC 8. Vitals: HR 82, BP 118/76, RR 15, Temp 37.1. Nurse reports stable overnight. Plan: continue antibiotics, recheck labs in the morning.
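Prompts like those above can also be sent programmatically to the SGLang server from the Quickstart through its OpenAI-compatible `/v1/chat/completions` endpoint. The sketch below uses only the Python standard library; the helper names (`ask_l1`, `strip_think`) and the tag-stripping regex are illustrative assumptions based on the `<think>...</think>` convention noted above, not part of the model's API:

```python
import json
import re
import urllib.request


def strip_think(text: str) -> str:
    """Remove the model's <think>...</think> reasoning block, keeping the final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def ask_l1(prompt: str, base_url: str = "http://localhost:9006") -> str:
    """Query the SGLang server's OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": "learning-unit/L1-16B-A3B",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 2048,  # leave room for the <think> block plus the answer
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return strip_think(body["choices"][0]["message"]["content"])
```

With the server running, `ask_l1("What are the diagnostic criteria for sepsis?")` would return the final answer with the reasoning block removed; keep the raw message content instead if you want to inspect the chain of thought.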
### Emergency Triage

> Perform KTAS triage for the following emergency-department patient, and suggest an initial diagnosis and differential diagnoses. A 78-year-old woman was brought to the emergency department by 119 ambulance. Around 22:00 she suddenly developed left facial drooping and slurred speech. She complains of headache and has a history of hypertension. Vital signs: BP 172/88, HR 92, RR 14, temperature 36.8, SpO2 98%; she is alert. There is no limb weakness.

### Adverse Drug Reaction (ADR) Causality Assessment

> Assess the causal relationship for the following patient's adverse drug reaction (ADR) using the WHO-UMC criteria. An 80-year-old woman admitted for bronchiectasis received moxifloxacin 400mg IV. During administration she newly developed generalized skin itching; after the drug was discontinued the patient herself reported that the itching subsided, and it subsequently resolved. Rechallenge was not performed. She has no known drug allergies, and no other concomitant medications or skin conditions that could explain the itching were identified.

## 📊 Benchmark

All benchmarks were evaluated using [CoEval](https://github.com/lunit-io/CoEval), Lunit's open-source medical LLM evaluation framework. Evaluations use greedy decoding (temperature=0). To reproduce these results:

```bash
git clone https://github.com/lunit-io/CoEval.git
cd CoEval
```

Refer to the [CoEval Quickstart](https://github.com/lunit-io/CoEval#quickstart) for setup and evaluation instructions.
### MCQA Benchmarks

| Model | [PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) | [AttrBench](https://huggingface.co/datasets/osunlp/AttributionBench) | [MedQA](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options) | [CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA) | [HeadQA](https://huggingface.co/datasets/alesi12/head_qa_v2) | [MedMCQA](https://huggingface.co/datasets/lighteval/med_mcqa) | [MMLU-Pro (Health)](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) | [M-ARC](https://huggingface.co/datasets/mkieffer/M-ARC) | [MetaMedQA](https://huggingface.co/datasets/maximegmd/MetaMedQA) | [MedHallu](https://huggingface.co/datasets/UTAustin-AIHealth/MedHallu) | [MedCalc](https://huggingface.co/datasets/ncbi/MedCalc-Bench) | [MedBullets](https://huggingface.co/datasets/mkieffer/Medbullets) 4-opt | [MedBullets](https://huggingface.co/datasets/mkieffer/Medbullets) 5-opt | [MedXpertQA](https://huggingface.co/datasets/TsinghuaC3I/MedXpertQA)-R | [MedXpertQA](https://huggingface.co/datasets/TsinghuaC3I/MedXpertQA)-U | W.Avg |
|:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| GPT-OSS-120B | 78.00 | 76.10 | 91.10 | 91.00 | 88.40 | 74.80 | 74.60 | 40.00 | 76.50 | 83.50 | 30.30 | 84.70 | 82.10 | 35.60 | 32.90 | 79.43 |
| GPT-OSS-20B | 75.80 | 74.80 | 83.90 | 84.80 | 83.30 | 65.40 | 70.50 | 31.00 | 70.10 | 81.30 | 29.20 | 73.40 | 70.50 | 24.70 | 21.20 | 73.38 |
| Qwen3.5-122B | 76.40 | 55.68 | 87.80 | 86.40 | 84.00 | 74.40 | 73.00 | 59.00 | 73.90 | 37.50 | 53.70 | 79.20 | 79.50 | 35.90 | 35.30 | 75.08 |
| MedGemma-27B | 73.40 | 74.80 | 84.40 | 85.00 | 83.80 | 71.90 | 73.00 | 48.00 | 69.60 | 81.40 | 24.10 | 73.70 | 68.80 | 19.10 | 20.50 | 73.99 |
| Gemma4-26B-A4B | 76.40 | 72.00 | 81.80 | 84.50 | 82.30 | 67.30 | 73.50 | 67.00 | 71.50 | 86.50 | 45.60 | 73.70 | 67.50 | 45.10 | 39.20 | 75.34 |
| L1-16B-A3B | 84.20 | 78.40 | 85.50 | 88.20 | 85.80 | 76.70 | 74.90 | 82.00 | 73.10 | 76.10 | 43.90 | 78.90 | 70.80 | 27.50 | 29.20 | 77.74 |

### Chat Task

| Model | [HealthBench-Consensus](https://github.com/openai/simple-evals) |
|:---|:---:|
| GPT-OSS-120B | 90.60 |
| GPT-OSS-20B | 78.70 |
| Qwen3.5-122B | 92.20 |
| MedGemma-27B | 90.70 |
| Gemma4-26B-A4B | 92.60 |
| L1-16B-A3B | 93.50 |

## 📝 Citation

```bibtex
@misc{lunit2026l1,
  title={L1: The First Clinical Language Model by Lunit},
  author={Lunit},
  year={2026},
  url={https://huggingface.co/learning-unit/L1-16B-A3B}
}
```

## ⚠️ Limitations

- **Not a substitute for professional medical judgment.** L1 may generate factually incorrect, incomplete, or outdated clinical information. All outputs should be verified by qualified healthcare professionals.
- **Thinking overhead.** Chain-of-thought reasoning in `<think>` tags increases token consumption and latency compared to non-thinking models of similar size.
- **Context length.** Maximum context length is 32,768 tokens.
- **No real-time knowledge.** The model's knowledge is limited to its training data cutoff and does not reflect the latest medical guidelines or drug approvals.

## 🤝 Acknowledgements

This work was supported by the Domain-Specific Foundation Model Project (인공지능 특화 파운데이션 모델 프로젝트), funded by the Ministry of Science and ICT (과학기술정보통신부) and managed by the National IT Industry Promotion Agency (NIPA).

L1 is a collaborative effort by the following consortium members:

**Industry**
- Lunit
- Trillion Labs
- SK Biopharmaceuticals
- Kakao Healthcare
- AIGEN Sciences
- D-Circle
- Rebellions
- Standigm

**Academia**
- Prof. Choi Yun-jae's Lab from KAIST
- Prof. Hong Seung-hoon's Lab from KAIST
- Prof. Jung Yu-seong's Lab from SNU
- Prof. Kim Hyun-woo's Lab from KAIST
- Prof. Kim Tae-gyun's Lab from KAIST
- Prof. Ye Jong-cheol's Lab from KAIST

**Hospitals**
- NHIS Ilsan Hospital
- Ewha Womans University Seoul Hospital
- Keimyung University Dongsan Medical Center
- Konyang University Hospital
- Korea University Research & Business Foundation
- Kyung Hee University Hospital at Gangdong
- Kyung Hee University Medical Center
- Pusan National University Yangsan Hospital
- Yongin Severance Hospital

*Consortium Members*

## 📄 License

This model is licensed under the [Apache 2.0 License](LICENSE).

## 📬 Contact

- Taesoo Kim (김태수): [taesoo.kim@lunit.io](mailto:taesoo.kim@lunit.io)
- Donggeun Yoo (유동근): [dgyoo@lunit.io](mailto:dgyoo@lunit.io)