---
language:
- de
- en
license: apache-2.0
tags:
- medical
- loinc
- terminology-mapping
- llama-3
- unsloth
base_model: unsloth/Llama-3.2-3B-Instruct
datasets:
- custom-loinc-dataset
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-generation
---

# LOINC Medical Terminology Mapper

A fine-tuned Llama-3.2-3B-Instruct model that maps German medical terms to LOINC codes using Chain-of-Thought reasoning.

## Model Details

- **Base Model:** unsloth/Llama-3.2-3B-Instruct
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Framework:** Unsloth + Hugging Face Transformers
- **Language:** German (primary), English (secondary)
- **Task:** Medical terminology to LOINC code mapping

## Performance

- **Accuracy:** 0.00% (not meaningful: no evaluation samples were recorded)
- **Total Samples:** 0
- **Correct Predictions:** 0

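The three figures above are related by a simple ratio. The following is a minimal sketch of that arithmetic, not the evaluation script used for this model; the function name `accuracy_percent` is hypothetical. It returns `None` when there are no samples, since accuracy is undefined in that case.

```python
from typing import Optional


def accuracy_percent(correct: int, total: int) -> Optional[float]:
    """Return accuracy as a percentage, or None when there is nothing to score."""
    if total < 0 or correct < 0 or correct > total:
        raise ValueError("expected 0 <= correct <= total")
    if total == 0:
        return None  # accuracy is undefined without evaluation samples
    return 100.0 * correct / total


print(accuracy_percent(0, 0))    # None
print(accuracy_percent(45, 60))  # 75.0
```

The `45` and `60` are made-up inputs chosen only to show the calculation.
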
## Training Configuration

- **LoRA Rank:** 64
- **LoRA Alpha:** 128
- **Learning Rate:** 0.0002
- **Batch Size:** 64
- **Epochs:** 1
- **Precision:** BF16

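To give a feel for what the rank and alpha values imply, here is a small sketch of the arithmetic. It assumes standard LoRA scaling (`alpha / rank`, not the rank-stabilized variant). The layer width `HIDDEN` is a hypothetical value for illustration, not a figure taken from this model card.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update under standard LoRA (alpha / rank)."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted weight: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 128  # values from the configuration above
HIDDEN = 3072          # hypothetical layer width, used only for illustration

print(lora_scaling(ALPHA, RANK))               # 2.0
print(lora_param_count(HIDDEN, HIDDEN, RANK))  # 393216
print(HIDDEN * HIDDEN)                         # 9437184 (full weight, for comparison)
```

Under these assumptions, the adapter for one square projection trains roughly 4% of the parameters of the full weight matrix.
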
## Usage

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Franc105/loinc-mapper",
    max_seq_length=2048,
    dtype=None,  # None lets Unsloth auto-detect the dtype
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format input (prompts stay in German, the model's primary language)
messages = [
    # System prompt: "You are an expert in medical terminology and LOINC mapping."
    {"role": "system", "content": "Du bist ein Experte für medizinische Terminologie und LOINC-Mapping."},
    # User fields: Begriff = term, Einheit = unit, Beschreibung = description
    {"role": "user", "content": "Begriff: Glukose\nEinheit: mg/dL\nBeschreibung: Blutzucker"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=512,
    temperature=0.1,
    top_p=0.9,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

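The decoded output is free text, so downstream code usually needs to pull the LOINC code out of it. The sketch below is one possible approach, not part of this model's tooling. It assumes the code appears somewhere in the text in the usual `digits-checkdigit` form and validates it with the Mod 10 (Luhn) check digit that LOINC codes carry. The function names and the sample string are hypothetical.

```python
import re
from typing import List

# One to seven digits, a hyphen, and a single check digit
LOINC_PATTERN = re.compile(r"\b\d{1,7}-\d\b")


def extract_loinc_codes(text: str) -> List[str]:
    """Return every LOINC-shaped token in the text, in order of appearance."""
    return LOINC_PATTERN.findall(text)


def has_valid_check_digit(code: str) -> bool:
    """Validate the trailing check digit with the Mod 10 (Luhn) algorithm."""
    body, check = code.split("-")
    total = 0
    for i, ch in enumerate(reversed(body)):
        digit = int(ch)
        if i % 2 == 0:  # double every second digit, starting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10 == int(check)


# Hypothetical model output, for illustration only
sample = "Reasoning: glucose measured as a mass concentration. LOINC: 2345-7"
codes = extract_loinc_codes(sample)
print(codes)                            # ['2345-7']
print(has_valid_check_digit(codes[0]))  # True
```

A valid check digit only shows that the code is well formed. It does not confirm that the code exists in LOINC or that the mapping is correct.
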
## Intended Use

This model is designed for:
- Mapping German medical terminology to standardized LOINC codes
- Supporting clinical documentation systems
- Assisting healthcare professionals with terminology standardization

## Limitations

- Primarily trained on German medical terms
- Requires a structured input format with the fields `Begriff` (term), `Einheit` (unit), and `Beschreibung` (description)
- May not cover all edge cases in medical terminology

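Because the input format is fixed, it can help to build the user message with a small helper. This is a minimal sketch: the function name `build_user_message` is hypothetical, and the field order simply follows the example in the Usage section.

```python
def build_user_message(term: str, unit: str, description: str) -> str:
    """Format the three fields as 'Begriff / Einheit / Beschreibung' lines."""
    fields = {
        "Begriff": term,              # term
        "Einheit": unit,              # unit
        "Beschreibung": description,  # description
    }
    return "\n".join(f"{name}: {value.strip()}" for name, value in fields.items())


message = build_user_message("Glukose", "mg/dL", "Blutzucker")
print(message)
```

The resulting string can be used as the `content` of the `user` message passed to the chat template.
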
## Training Data

- Custom dataset of German LOINC mappings
- Augmented with synonyms from RELATEDNAMES2
- Chain-of-Thought reasoning examples

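This card does not describe how the synonym augmentation was implemented. As a rough sketch of one plausible first step, the code below splits a `RELATEDNAMES2` value into individual synonyms, assuming the field is a semicolon-separated string as in the LOINC table distribution. The function name and the sample value are hypothetical.

```python
from typing import List


def split_related_names(related_names: str) -> List[str]:
    """Split a semicolon-separated field into unique, trimmed names."""
    seen = set()
    names = []
    for raw in related_names.split(";"):
        name = raw.strip()
        key = name.lower()  # compare case-insensitively
        if name and key not in seen:
            seen.add(key)
            names.append(name)
    return names


# Made-up field value, for illustration only
sample_field = "Glucose; Glu; Blutzucker; glucose; "
print(split_related_names(sample_field))  # ['Glucose', 'Glu', 'Blutzucker']
```

Each synonym could then stand in for the `Begriff` field to produce an additional training example for the same LOINC code.
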
## Citation

If you use this model, please cite:

```bibtex
@misc{loinc-mapper-2024,
  title={LOINC Medical Terminology Mapper},
  author={IMESO IT GmbH},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Franc105/loinc-mapper}}
}
```

## License

Apache 2.0

## Contact

For questions or issues, please open an issue on the model repository.