LLaVA Skin Disease Multimodal Assistant
Model Details
Model Description
This model is a parameter-efficient fine-tune (PEFT adapter) of LLaVA v1.5 7B, designed to generate detailed descriptions and multi-turn conversations about dermatological images.
The model was fine-tuned to improve the ability of vision-language models to describe skin conditions and generate structured medical-style explanations from images. The goal of this project was to explore how multimodal models can be adapted for medical reasoning and structured visual analysis using limited compute resources.
Training focused on improving the model’s ability to:
- Describe dermatological patterns in images
- Generate structured explanations of possible conditions
- Produce multi-turn conversational responses grounded in visual input
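The multi-turn, image-grounded format above can be illustrated with a small prompt builder. This is a minimal sketch assuming the vicuna-style conversation template used by LLaVA v1.5 (`USER: … ASSISTANT: …` turns, with an `<image>` placeholder on the first user turn); the function name and example dialogue are illustrative, not part of this repository.

```python
def build_llava_prompt(turns, system=None):
    """Assemble a multi-turn LLaVA v1.5 (vicuna-style) prompt string.

    turns: list of (user_text, assistant_text_or_None) pairs; pass None
    as the final assistant reply so the model generates it. The <image>
    token marks where image features are spliced into the first turn.
    """
    parts = []
    if system:
        parts.append(system + " ")
    for i, (user, assistant) in enumerate(turns):
        image_tag = "<image>\n" if i == 0 else ""  # image only in turn 1
        parts.append(f"USER: {image_tag}{user} ASSISTANT:")
        if assistant is not None:
            parts.append(f" {assistant}</s>")  # close completed turns
    return "".join(parts)

# Hypothetical two-turn exchange ending with an open assistant slot.
prompt = build_llava_prompt([
    ("Describe the lesion in this image.",
     "A well-demarcated erythematous plaque with silvery scale."),
    ("What conditions could present this way?", None),
])
print(prompt)
```

The returned string ends with an open `ASSISTANT:` marker, which is where generation continues when the prompt is passed to the model.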
This work is an experimental research effort in efficient multimodal fine-tuning.
- Developed by: Abdulmateen Ashifa
- Model type: Vision-Language Model (VLM)
- Language(s): English
- License: Same as base model (LLaVA v1.5 license)
- Finetuned from model: liuhaotian/llava-v1.5-7b
Model Sources
- Base Model Repository: https://huggingface.co/liuhaotian/llava-v1.5-7b
Uses
Direct Use
The model can be used for research and experimentation with multimodal models that generate descriptions of dermatological images.
Example tasks include:
- Visual description of skin lesions
- Educational demonstrations of medical image analysis
- Multimodal dialogue generation
Downstream Use
The model may be integrated into systems that explore:
- AI-assisted dermatology education
- Synthetic training data generation
- Multimodal medical reasoning research
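For the synthetic-training-data use case above, examples would typically be emitted in the LLaVA instruction-tuning JSON layout (alternating `human`/`gpt` turns, with the `<image>` placeholder on the first human turn). The helper below is a hypothetical sketch of that record shape; the field values and file paths are illustrative.

```python
import json

def make_llava_record(record_id, image_path, qa_pairs):
    """Build one training example in the LLaVA instruction-tuning
    JSON layout: an id, an image path, and a conversations list of
    alternating human/gpt turns."""
    conversations = []
    for i, (question, answer) in enumerate(qa_pairs):
        image_tag = "<image>\n" if i == 0 else ""  # first turn only
        conversations.append({"from": "human", "value": image_tag + question})
        conversations.append({"from": "gpt", "value": answer})
    return {"id": record_id, "image": image_path, "conversations": conversations}

# Illustrative single-turn dermatology record.
record = make_llava_record(
    "derm-0001",
    "images/derm-0001.jpg",
    [("Describe the visible skin findings.",
      "Multiple grouped vesicles on an erythematous base.")],
)
print(json.dumps(record, indent=2))
```

A list of such records serialized to a single JSON file matches the format the LLaVA training scripts consume.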
Out-of-Scope Use
This model should not be used for clinical diagnosis or medical decision making. The model is intended for research and experimentation only.
Bias, Risks, and Limitations
The model inherits limitations from both the base LLaVA architecture and the dataset used for fine-tuning.
Potential risks include:
- Incorrect medical descriptions
- Hallucinated conditions
- Bias from the training dataset
- Limited generalization to unseen skin conditions
Because of these limitations, outputs should always be verified by qualified professionals in real-world contexts.
How to Get Started with the Model
Example usage with Hugging Face Transformers and PEFT:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base LLaVA v1.5 7B weights (the LLaVA model classes must be
# available in your environment for this checkpoint to load).
base_model = AutoModelForCausalLM.from_pretrained(
    "liuhaotian/llava-v1.5-7b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the fine-tuned PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, "your-model-path")
model.eval()
```
Model tree for Abdulmateen/llava-finetuned
- Base model: liuhaotian/llava-v1.5-7b