---
language:
- de
- en
license: apache-2.0
tags:
- medical
- loinc
- terminology-mapping
- llama-3
- unsloth
base_model: unsloth/Llama-3.2-3B-Instruct
datasets:
- custom-loinc-dataset
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-generation
---

# LOINC Medical Terminology Mapper

Fine-tuned Llama-3.2-3B model for mapping German medical terms to LOINC codes using Chain-of-Thought reasoning.

## Model Details

- **Base Model:** unsloth/Llama-3.2-3B-Instruct
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Unsloth + Hugging Face Transformers
- **Language:** German (primary), English (secondary)
- **Task:** Medical terminology to LOINC code mapping

## Performance

Evaluation metrics have not yet been recorded for this checkpoint (no samples evaluated):

- **Accuracy:** 0.00%
- **Total Samples:** 0
- **Correct Predictions:** 0

## Training Configuration

- **LoRA Rank:** 64
- **LoRA Alpha:** 128
- **Learning Rate:** 0.0002
- **Batch Size:** 64
- **Epochs:** 1
- **Precision:** BF16
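The LoRA configuration above implies a scaling factor of alpha/rank = 128/64 = 2.0 applied to each low-rank update. The sketch below illustrates the mechanics numerically; the layer dimensions are hypothetical and chosen only for illustration, not taken from the actual model:

```python
import numpy as np

# Hypothetical layer dimensions, for illustration only.
d_in, d_out = 3072, 3072
r, alpha = 64, 128        # LoRA rank and alpha from the training config
scaling = alpha / r       # = 2.0

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero at init

x = rng.standard_normal(d_in)

# LoRA forward pass: y = W x + (alpha / r) * B (A x)
y = W @ x + scaling * (B @ (A @ x))

# Because B starts at zero, the adapter contributes nothing at initialization.
assert np.allclose(y, W @ x)

# Trainable parameters per adapted matrix: r * (d_in + d_out) vs d_in * d_out.
lora_params = r * (d_in + d_out)
full_params = d_in * d_out
print(f"LoRA trains {lora_params:,} params instead of {full_params:,}")
```

This is why rank-64 LoRA fits on modest hardware: only the small `A` and `B` matrices receive gradients while `W` stays frozen.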

## Usage

```python
from unsloth import FastLanguageModel

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Franc105/loinc-mapper",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format input
messages = [
    {"role": "system", "content": "Du bist ein Experte für medizinische Terminologie und LOINC-Mapping."},
    {"role": "user", "content": "Begriff: Glukose\nEinheit: mg/dL\nBeschreibung: Blutzucker"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

# Generate
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Intended Use

This model is designed for:
- Mapping German medical terminology to standardized LOINC codes
- Supporting clinical documentation systems
- Assisting healthcare professionals with terminology standardization

## Limitations

- Primarily trained on German medical terms
- Requires structured input format (Begriff, Einheit, Beschreibung)
- May not cover all edge cases in medical terminology
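Because the model expects the structured Begriff/Einheit/Beschreibung format, it can help to build the user message programmatically. The helper below is a hypothetical convenience, not part of the released code:

```python
def format_loinc_query(begriff: str, einheit: str = "", beschreibung: str = "") -> str:
    """Build the structured user prompt the model was trained on.

    Fields: Begriff (term), Einheit (unit), Beschreibung (description).
    Optional fields are omitted when empty.
    """
    lines = [f"Begriff: {begriff}"]
    if einheit:
        lines.append(f"Einheit: {einheit}")
    if beschreibung:
        lines.append(f"Beschreibung: {beschreibung}")
    return "\n".join(lines)

print(format_loinc_query("Glukose", "mg/dL", "Blutzucker"))
# Begriff: Glukose
# Einheit: mg/dL
# Beschreibung: Blutzucker
```

The returned string can be passed directly as the `content` of the user message in the usage example above.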

## Training Data

- Custom dataset of German LOINC mappings
- Augmented with synonyms from RELATEDNAMES2
- Chain-of-Thought reasoning examples
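A Chain-of-Thought training record in this style might look as follows. The field names and reasoning text are illustrative assumptions, since the exact schema of the custom dataset is not published:

```python
import json

# Hypothetical training example; only a sketch of the Chain-of-Thought format.
example = {
    "messages": [
        {"role": "system",
         "content": "Du bist ein Experte für medizinische Terminologie und LOINC-Mapping."},
        {"role": "user",
         "content": "Begriff: Glukose\nEinheit: mg/dL\nBeschreibung: Blutzucker"},
        {"role": "assistant",
         "content": (
             # Step 1: term + unit point to a quantitative blood glucose test.
             "Schritt 1: 'Glukose' mit Einheit mg/dL deutet auf eine "
             "quantitative Blutzuckermessung hin.\n"
             # Step 2: matching LOINC component and system.
             "Schritt 2: Passende LOINC-Komponente: Glucose, System: Serum/Plasma.\n"
             "LOINC-Code: 2345-7"
         )},
    ]
}

print(json.dumps(example, ensure_ascii=False, indent=2))
```

Synonym augmentation from RELATEDNAMES2 would generate additional records with the same assistant target but varied `Begriff` wording.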

## Citation

If you use this model, please cite:

```bibtex
@misc{loinc-mapper-2024,
  title={LOINC Medical Terminology Mapper},
  author={IMESO IT GmbH},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Franc105/loinc-mapper}}
}
```

## License

Apache 2.0

## Contact

For questions or issues, please open an issue on the model repository.