jeffkang-lunit committed
Commit 2d5fea2 · verified · 1 Parent(s): 3e93565

Delete README.md with huggingface_hub

Files changed (1):
  1. README.md +0 -69

README.md DELETED
---
license: apache-2.0
language:
- en
tags:
- medical
- clinical
- moe
- mixture-of-experts
- gravity-moe
- sft
library_name: transformers
pipeline_tag: text-generation
---

# L1

L1 is a clinical language model built on the **GravityMoE** (Mixture-of-Experts) architecture, fine-tuned for medical and clinical decision-support tasks.

## Model Details

| Property | Value |
|---|---|
| Architecture | GravityMoE (Mixture-of-Experts) |
| Total Parameters | ~16B |
| Active Parameters | ~4.5B per token |
| Routed Experts | 64 |
| Shared Experts | 1 |
| Experts per Token | 8 |
| Hidden Size | 2048 |
| Layers | 28 |
| Attention Heads | 16 |
| KV LoRA Rank | 512 |
| Max Context Length | 32,768 tokens |
| Precision | bfloat16 |
| Vocab Size | 151,552 |
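The expert counts above (64 routed experts, 1 shared expert, 8 experts active per token) describe standard top-k MoE routing. The sketch below illustrates that mechanism in NumPy with toy shapes; the gate and expert functions are hypothetical stand-ins, not the model's actual implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(x, gate_w, experts, shared_expert, k=8):
    """One MoE block: every token goes through the shared expert,
    plus its top-k routed experts, weighted by the gate scores."""
    probs = softmax(x @ gate_w)                  # (tokens, n_experts) routing weights
    out = shared_expert(x)                       # shared expert sees every token
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-k:]          # indices of the k highest-scoring experts
        w = probs[t, top] / probs[t, top].sum()  # renormalize over the chosen k
        for j, wj in zip(top, w):
            out[t] += wj * experts[j](x[t])      # only k of n_experts run per token
    return out

# toy demo: 4 tokens, hidden size 32, 64 routed experts, top-8 routing
rng = np.random.default_rng(0)
hidden, n_experts, tokens = 32, 64, 4
gate_w = rng.normal(size=(hidden, n_experts))
experts = [(lambda W: (lambda v: v @ W))(0.01 * rng.normal(size=(hidden, hidden)))
           for _ in range(n_experts)]
shared_expert = lambda v: v.copy()               # placeholder shared expert
x = rng.normal(size=(tokens, hidden))
y = moe_layer(x, gate_w, experts, shared_expert, k=8)
print(y.shape)  # (4, 32)
```

This is why the active parameter count (~4.5B) is far below the total (~16B): each token's forward pass touches only the shared expert and 8 of the 64 routed experts.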

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "learning-unit/L1"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "What are the diagnostic criteria for sepsis?"}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
inputs = inputs.to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Training

- **Method**: Supervised Fine-Tuning (SFT)
- **Epochs**: 3
- **Final Training Loss**: 0.247
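
SFT on chat-formatted data typically computes the cross-entropy loss only on assistant tokens, masking out the prompt. The sketch below shows that label masking in isolation; the span format and token values are hypothetical, not taken from this model's training code.

```python
IGNORE_INDEX = -100  # ignored by PyTorch-style cross-entropy losses

def mask_labels(input_ids, assistant_spans):
    """Copy input_ids into labels, hiding every token that is not
    part of an assistant reply from the loss."""
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:       # [start, end) token index ranges
        labels[start:end] = input_ids[start:end]
    return labels

# toy sequence: tokens 0-4 are the prompt, 5-8 the assistant reply
ids = [101, 7, 8, 9, 102, 55, 56, 57, 103]
labels = mask_labels(ids, [(5, 9)])
# labels -> [-100, -100, -100, -100, -100, 55, 56, 57, 103]
```

With this masking, the reported training loss is measured only over the response tokens the model is meant to learn.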

## License

Apache 2.0