---
language:
- de
- en
license: apache-2.0
tags:
- medical
- loinc
- terminology-mapping
- llama-3
- unsloth
base_model: unsloth/Llama-3.2-3B-Instruct
datasets:
- custom-loinc-dataset
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-generation
---
# LOINC Medical Terminology Mapper
Fine-tuned Llama-3.2-3B model for mapping German medical terms to LOINC codes using Chain-of-Thought reasoning.
## Model Details
- **Base Model:** unsloth/Llama-3.2-3B-Instruct
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Unsloth + Hugging Face Transformers
- **Language:** German (primary), English (secondary)
- **Task:** Medical terminology to LOINC code mapping
## Performance
- **Accuracy:** 0.00% (placeholder; no evaluation samples have been scored)
- **Total Samples:** 0
- **Correct Predictions:** 0
## Training Configuration
- **LoRA Rank:** 64
- **LoRA Alpha:** 128
- **Learning Rate:** 0.0002
- **Batch Size:** 64
- **Epochs:** 1
- **Precision:** BF16
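
The hyperparameters above can be collected in one place, as in the minimal sketch below. The `TrainingConfig` class and its field names are illustrative and not part of the released training code; the values are copied from the list. The only derived quantity is the standard LoRA scaling factor, alpha divided by rank. Whether the batch size of 64 is per device or an effective (accumulated) size is not stated here, so the sketch records it as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Illustrative container; values are copied from the list above."""

    lora_rank: int = 64
    lora_alpha: int = 128
    learning_rate: float = 2e-4
    batch_size: int = 64  # per-device vs. effective size is not specified
    epochs: int = 1
    precision: str = "bf16"

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank


config = TrainingConfig()
print(config.lora_scaling)  # 2.0
```

With rank 64 and alpha 128, the adapter update is scaled by a factor of 2.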
## Usage
```python
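from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer (4-bit quantized to save GPU memory)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Franc105/loinc-mapper",
    max_seq_length=2048,
    dtype=None,  # let Unsloth pick the dtype for the current GPU
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format the input as a chat. The prompts stay in German, matching the training data.
# System: "You are an expert in medical terminology and LOINC mapping."
# User fields: Begriff = term, Einheit = unit, Beschreibung = description
# ("Glukose" = glucose, "Blutzucker" = blood sugar)
messages = [
    {"role": "system", "content": "Du bist ein Experte für medizinische Terminologie und LOINC-Mapping."},
    {"role": "user", "content": "Begriff: Glukose\nEinheit: mg/dL\nBeschreibung: Blutzucker"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate with low-temperature sampling
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True,
)

# The decoded text contains the prompt followed by the model's answer
print(tokenizer.decode(outputs[0], skip_special_tokens=True))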
```
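
Because the model produces Chain-of-Thought text, the LOINC code usually has to be pulled out of the generated string. The following is a minimal sketch of one way to do that, not part of the model's own tooling. It assumes the code appears in the usual `number-check digit` form, and it uses the Mod 10 (Luhn) check digit that LOINC codes carry to discard malformed candidates. The helper names, the 1 to 7 digit length bound, and the sample string are assumptions made for illustration; a valid check digit does not prove that a code exists in LOINC or that the mapping is correct.

```python
import re

# Numeric part, hyphen, single check digit. The length bound is an assumption.
LOINC_PATTERN = re.compile(r"\b(\d{1,7})-(\d)\b")


def check_digit(number_part: str) -> int:
    """Compute the Mod 10 (Luhn) check digit for the numeric part of a code."""
    total = 0
    for position, char in enumerate(reversed(number_part)):
        digit = int(char)
        if position % 2 == 0:  # rightmost digit, then every second digit
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def extract_loinc_codes(text: str) -> list[str]:
    """Return candidates in `text` whose check digit is valid, in order."""
    codes = []
    for match in LOINC_PATTERN.finditer(text):
        number_part, digit = match.groups()
        if check_digit(number_part) == int(digit):
            codes.append(match.group(0))
    return codes


# Hypothetical generated text, not real model output
sample = "Begriff: Glukose, Einheit: mg/dL. LOINC-Code: 2345-7"
print(extract_loinc_codes(sample))  # ['2345-7']
```

An empty result signals that the generation contained no well-formed code, which is a reasonable trigger for a retry or manual review.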
## Intended Use
This model is designed for:
- Mapping German medical terminology to standardized LOINC codes
- Supporting clinical documentation systems
- Assisting healthcare professionals with terminology standardization
## Limitations
- Primarily trained on German medical terms
- Requires structured input format (Begriff, Einheit, Beschreibung)
- May not cover all edge cases in medical terminology
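
Since the model expects the three-field layout (Begriff, Einheit, Beschreibung), building the prompt through one helper avoids formatting drift. The sketch below mirrors the system prompt and field order from the Usage example; `build_messages` and `SYSTEM_PROMPT` are hypothetical names, and leaving the unit or description empty is an assumption about what the model tolerates, not documented behavior.

```python
# System prompt copied from the Usage example
SYSTEM_PROMPT = "Du bist ein Experte für medizinische Terminologie und LOINC-Mapping."


def build_messages(term: str, unit: str = "", description: str = "") -> list[dict[str, str]]:
    """Build chat messages in the Begriff / Einheit / Beschreibung layout."""
    term = term.strip()
    if not term:
        raise ValueError("term (Begriff) must not be empty")
    user_content = (
        f"Begriff: {term}\n"
        f"Einheit: {unit.strip()}\n"
        f"Beschreibung: {description.strip()}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages("Glukose", "mg/dL", "Blutzucker")
print(messages[1]["content"])
```

The returned list can be passed directly to `tokenizer.apply_chat_template` as in the Usage section.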
## Training Data
- Custom dataset of German LOINC mappings
- Augmented with synonyms from the LOINC `RELATEDNAMES2` field
- Chain-of-Thought reasoning examples
## Citation
If you use this model, please cite:
```bibtex
@misc{loinc-mapper-2024,
title={LOINC Medical Terminology Mapper},
author={IMESO IT GmbH},
year={2024},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/Franc105/loinc-mapper}}
}
```
## License
Apache 2.0
## Contact
For questions or issues, please open an issue on the model repository.