---
license: mit
base_model:
- microsoft/Phi-3-mini-4k-instruct
tags:
- Medical
- MedicalCoding
- Pharma
---

# Medical Coding LLM

Predict ICD-10 and CPT codes from clinical notes using a fine-tuned LLM.

This model is Phi-3-mini fine-tuned on clinical notes with LoRA and 4-bit quantization. It can generate both ICD/CPT codes and short explanations, helping automate the medical coding process.

## Model Details

- Base Model: microsoft/Phi-3-mini-4k-instruct
- Fine-Tuning: LoRA (r=16, alpha=32, dropout=0.05)
- Quantization: 4-bit (BitsAndBytes NF4)
- Training Dataset: Custom dataset of clinical notes, ICD codes, and supporting evidence
- Task: Causal Language Modeling for code prediction

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import re

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Kavyaah/medical-coding-llm")
model = AutoModelForCausalLM.from_pretrained("Kavyaah/medical-coding-llm")
model.eval()

# Function to predict ICD/CPT codes
def get_code(statement, max_new_tokens=50):
    prompt = f"Assign the correct ICD or CPT medical code for this case:\n{statement}\nCode:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text generated after the final "Code:" marker
    if "Code:" in result:
        result = result.split("Code:")[-1]

    # Extract the first ICD-style code (letter, 1-3 digits, optional suffix);
    # if nothing matches (e.g. a purely numeric CPT code), return the raw text
    match = re.search(r"\b[A-Z]\d{1,3}\.?[A-Z0-9]*\b", result)
    return match.group(0).strip() if match else result.strip()

# Example
statement = "Patient diagnosed with Type 2 diabetes mellitus without complications."
print(get_code(statement))
# Output: E11.9
```

## Evaluation

- Exact match accuracy: 25%
- Semantic accuracy (ICD block match): 50%

## Intended Use

- Assisting medical coders and healthcare professionals.
- Automating initial code suggestions from clinical notes.

## Limitations

- Trained on a small dataset; may not cover all ICD/CPT codes.
- Use as an assistive tool, not a replacement for professional judgment.
- Always review predicted codes before clinical or billing use.

## License

MIT License. Feel free to use and adapt the model under the terms of that license.
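
## Evaluation Sketch

The two metrics reported under Evaluation could be computed along the following lines. This is a minimal sketch, not the evaluation script used for this model. It assumes that "ICD block match" means the predicted and reference codes share the same three-character ICD-10 category (the part before the decimal point), which may differ from the definition actually used. The function names `icd_category`, `exact_match_accuracy`, and `block_match_accuracy` are hypothetical, and the sample data is made up for illustration.

```python
def icd_category(code: str) -> str:
    """Return the three-character category of an ICD-10 code, e.g. 'E11.9' -> 'E11'.

    Assumption: "block match" is approximated by this three-character category.
    """
    return code.strip().upper().split(".")[0][:3]


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal the reference code exactly."""
    if not references:
        return 0.0
    hits = sum(
        p.strip().upper() == r.strip().upper()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def block_match_accuracy(predictions, references):
    """Fraction of predictions whose three-character category matches the reference."""
    if not references:
        return 0.0
    hits = sum(
        icd_category(p) == icd_category(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


# Made-up illustrative data, not the model's evaluation set
predictions = ["E11.9", "I10", "J45.909", "E11.65"]
references = ["E11.9", "I10", "J45.901", "K21.9"]

print(exact_match_accuracy(predictions, references))  # 0.5
print(block_match_accuracy(predictions, references))  # 0.75
```

Block-level matching gives partial credit when the model finds the right condition but the wrong specificity (here `J45.909` versus `J45.901`), which is why it is always at least as high as exact match accuracy.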