| --- |
| license: mit |
| language: |
| - en |
| pipeline_tag: image-to-image |
| library_name: pytorch |
|
|
| tags: |
| - medical-imaging |
| - computer-vision |
| - pytorch |
| - pix2pix |
| - image-enhancement |
| - laparoscopy |
| - surgical-smoke-removal |
| - defogging |
| - gan |
| - deep-learning |
| - opencv |
| - healthcare-ai |
|
|
| datasets: |
| - custom |
|
|
| metrics: |
| - psnr |
| - ssim |
| --- |
| |
| # Laparoscopy Image Defogging AI |
|
|
| An AI-powered laparoscopic image enhancement system designed to remove fog, haze, and surgical smoke from minimally invasive surgical imagery using deep learning and image restoration techniques. |
|
|
| This repository contains pretrained Pix2Pix UNet-256 generator weights for real-time laparoscopic image defogging and enhancement. |
|
|
| --- |
|
|
| # Model Details |
|
|
| ## Model Description |
|
|
| This model is designed to improve the visual clarity of laparoscopic surgical images by removing: |
| - Lens fogging |
| - Surgical smoke |
| - Haze |
| - Low-contrast artifacts |
|
|
| The system combines a learned generator with classical image-restoration steps: |
| - Pix2Pix GAN image translation |
| - Dark Channel Prior (DCP) |
| - Guided filtering |
| - CLAHE enhancement |
| - Contrast restoration |
| - Sharpening and post-processing |
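This card does not document how the Dark Channel Prior stage is wired into the pipeline, so the following is only a minimal NumPy sketch of the standard DCP formulation. The function names, the patch size of 15, `omega = 0.95`, and `t_min = 0.1` are illustrative assumptions, not values taken from this repository.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dark_channel(image: np.ndarray, patch: int = 15) -> np.ndarray:
    """Minimum over colour channels, then over a patch x patch window.

    image: float array of shape (H, W, 3). patch must be odd so that the
    output keeps the input's height and width.
    """
    min_rgb = image.min(axis=2)                            # (H, W)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))  # (H, W, patch, patch)
    return windows.min(axis=(2, 3))


def estimate_atmospheric_light(image, dark, top_fraction=0.001):
    """Mean colour of the pixels with the largest dark-channel values."""
    n = max(1, int(dark.size * top_fraction))
    idx = np.argsort(dark.ravel())[-n:]
    return image.reshape(-1, 3)[idx].mean(axis=0)          # (3,)


def dehaze(image, patch=15, omega=0.95, t_min=0.1):
    """Recover a haze-reduced image from a float image in [0, 1]."""
    dark = dark_channel(image, patch)
    atmosphere = estimate_atmospheric_light(image, dark)
    normalized = image / np.maximum(atmosphere, 1e-6)
    # Transmission estimate; clipped below so the division stays stable.
    transmission = 1.0 - omega * dark_channel(normalized, patch)
    transmission = np.clip(transmission, t_min, 1.0)
    recovered = (image - atmosphere) / transmission[..., None] + atmosphere
    return np.clip(recovered, 0.0, 1.0)
```

In a full pipeline the raw transmission map would typically be refined (for example with a guided filter) before the recovery step; that refinement is omitted here to keep the sketch short.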
|
|
| The model aims to enhance visibility in minimally invasive surgical environments for research and educational applications. |
|
|
| --- |
|
|
| - **Developed by:** Vishnu Das |
| - **Model type:** Pix2Pix GAN / UNet-256 Generator |
| - **Framework:** PyTorch |
| - **Language(s):** English |
| - **License:** MIT |
| - **Task:** Image-to-Image Translation / Medical Image Enhancement |
|
|
| --- |
|
|
| # Model Sources |
|
|
| - **Repository:** https://github.com/YOUR_GITHUB_USERNAME/YOUR_REPOSITORY_NAME |
| - **Model Repository:** https://huggingface.co/vishnudaspk/Laparoscopy-Image-Defogging-AI |
|
|
| --- |
|
|
| # Uses |
|
|
| ## Direct Use |
|
|
| This model can be used for: |
| - Laparoscopic image enhancement |
| - Surgical smoke removal |
| - Fog removal |
| - Medical imaging research |
| - Computer vision experimentation |
| - Deep learning demonstrations |
|
|
| --- |
|
|
| ## Downstream Use |
|
|
| Possible downstream applications: |
| - Real-time surgical visualization systems |
| - AI-assisted medical imaging pipelines |
| - Surgical simulation environments |
| - Medical video enhancement workflows |
|
|
| --- |
|
|
| ## Out-of-Scope Use |
|
|
| This model is NOT intended for: |
| - Clinical diagnosis |
| - Real surgical deployment |
| - Medical decision-making |
| - Autonomous healthcare systems |
|
|
| Outputs should always be reviewed by qualified professionals. |
|
|
| --- |
|
|
| # Bias, Risks, and Limitations |
|
|
| - Performance depends heavily on input image quality and on how closely inputs match the training distribution. |
| - The model may produce artifacts under dense smoke or extreme lighting conditions. |
| - Results may vary across laparoscopic devices and surgical environments. |
| - This system is intended for research and educational purposes only. |
|
|
| --- |
|
|
| # Recommendations |
|
|
| Users should: |
| - Validate outputs before use |
| - Avoid clinical reliance |
| - Test across multiple datasets |
| - Use CUDA-enabled GPUs for best performance |
|
|
| --- |
|
|
| # How to Get Started with the Model |
|
|
| ## Installation |
|
|
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| Recommended: |
|
|
| * NVIDIA GPU |
| * CUDA-enabled PyTorch |
|
|
| --- |
|
|
| ## Place Model Weights |
|
|
| Place the generator weights file: |
|
|
| ```text |
| best_net_G.pth |
| ``` |
|
|
| inside the checkpoint directory: |
|
|
| ```text |
| scripts/checkpoints/pix2pix_laparoscopy_dc/ |
| ``` |
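Before launching the app, it can help to confirm that the weights sit where this card says they belong. The sketch below uses only the standard library. The `find_weights` helper, and the assumption that the checkpoint directory is resolved relative to the repository root, are illustrative and not part of the project's code.

```python
from pathlib import Path

# Directory and filename as stated in this model card.
CHECKPOINT_DIR = Path("scripts/checkpoints/pix2pix_laparoscopy_dc")
WEIGHTS_NAME = "best_net_G.pth"


def find_weights(repo_root: Path = Path(".")) -> Path:
    """Return the path to the generator weights, or raise if they are missing.

    Hypothetical helper: assumes CHECKPOINT_DIR is relative to repo_root.
    """
    weights = repo_root / CHECKPOINT_DIR / WEIGHTS_NAME
    if not weights.is_file():
        raise FileNotFoundError(f"Expected generator weights at {weights}")
    return weights
```

Calling `find_weights()` from the repository root either returns the resolved path or fails early with a message naming the expected location.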
|
|
| --- |
|
|
| ## Run Application |
|
|
| ```bash |
| python app.py |
| ``` |
|
|
| Then open the local web interface in a browser: |
|
|
| ```text |
| http://127.0.0.1:5000 |
| ``` |
|
|
| --- |
|
|
| # Training Details |
|
|
| ## Training Data |
|
|
| The model was trained on custom laparoscopic imagery containing varying levels of: |
|
|
| * Surgical smoke |
| * Fogging |
| * Low visibility |
| * Illumination artifacts |
|
|
| Data preprocessing included: |
|
|
| * Resizing |
| * Contrast normalization |
| * Paired image generation |
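The card does not say how the image pairs are stored. One common convention for Pix2Pix training data, assumed here, is a single image with the degraded input on the left and the clear target on the right. A minimal NumPy sketch with hypothetical function names:

```python
import numpy as np


def make_pair(foggy: np.ndarray, clear: np.ndarray) -> np.ndarray:
    """Concatenate an (H, W, 3) input and its target side by side -> (H, 2W, 3)."""
    if foggy.shape != clear.shape:
        raise ValueError(f"shape mismatch: {foggy.shape} vs {clear.shape}")
    return np.concatenate([foggy, clear], axis=1)


def split_pair(pair: np.ndarray):
    """Inverse of make_pair: split a paired image along its width."""
    width = pair.shape[1]
    if width % 2:
        raise ValueError("paired image width must be even")
    half = width // 2
    return pair[:, :half], pair[:, half:]
```

Under this layout, two 256x256 images become one 256x512 training sample, and `split_pair` recovers both halves at load time.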
|
|
| --- |
|
|
| ## Training Procedure |
|
|
| ### Preprocessing |
|
|
| * Image normalization |
| * CLAHE enhancement |
| * Resizing |
| * Data augmentation |
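The exact normalization is not specified in this card. Pix2Pix generators typically end in a `tanh` layer, so a common choice, assumed here, is to scale 8-bit pixels to [-1, 1] in channel-first layout. The function names are hypothetical.

```python
import numpy as np


def to_model_input(image_u8: np.ndarray) -> np.ndarray:
    """(H, W, 3) uint8 in [0, 255] -> (3, H, W) float32 in [-1, 1]."""
    scaled = image_u8.astype(np.float32) / 127.5 - 1.0
    return np.transpose(scaled, (2, 0, 1))


def from_model_output(tensor: np.ndarray) -> np.ndarray:
    """(3, H, W) float in [-1, 1] -> (H, W, 3) uint8 in [0, 255]."""
    image = (np.transpose(tensor, (1, 2, 0)) + 1.0) * 127.5
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)
```

The two functions are inverses for 8-bit images, so a frame can be converted for inference and the generator's output converted back for display with the same scaling.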
|
|
| --- |
|
|
| ### Training Hyperparameters |
|
|
| * **Architecture:** Pix2Pix UNet-256 |
| * **Framework:** PyTorch |
| * **Training regime:** Mixed-precision training on CUDA |
| * **Loss Functions:** GAN Loss + L1 Loss |
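For reference, the standard Pix2Pix formulation combines a GAN loss and an L1 loss as follows, where $x$ is the degraded input, $y$ the clear target, $G$ the generator, and $D$ the discriminator. The weight $\lambda$ is not stated in this card, so it is left here as an unspecified hyperparameter.

```latex
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
  \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y}\big[\lVert y - G(x) \rVert_{1}\big]

G^{*} = \arg\min_{G}\max_{D}\;
  \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```

The adversarial term pushes outputs toward realistic texture, while the L1 term keeps them close to the paired target pixel by pixel.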
|
|
| --- |
|
|
| # Evaluation |
|
|
| ## Testing Data, Factors & Metrics |
|
|
| ### Testing Data |
|
|
| Custom laparoscopic test imagery. |
|
|
| --- |
|
|
| ### Metrics |
|
|
| Evaluation metrics include: |
|
|
| * PSNR |
| * SSIM |
| * Perceived visual quality (qualitative inspection) |
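The evaluation code is not included in this card. As a sketch of how the two numeric metrics are defined, the NumPy functions below compute PSNR and a simplified SSIM. Note the simplification: standard SSIM averages the index over local windows, whereas `global_ssim` (a hypothetical name) computes it once over the whole image, so its values will differ from library implementations.

```python
import numpy as np


def psnr(reference: np.ndarray, restored: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    ref = reference.astype(np.float64)
    out = restored.astype(np.float64)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def global_ssim(reference: np.ndarray, restored: np.ndarray, data_range: float = 255.0) -> float:
    """SSIM evaluated over the whole image as a single window (simplified)."""
    x = reference.astype(np.float64)
    y = restored.astype(np.float64)
    c1 = (0.01 * data_range) ** 2   # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizer for the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Both functions expect the restored image and the clear reference to share the same shape and value range.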
|
|
| --- |
|
|
| # Results |
|
|
| In qualitative evaluation on the test imagery, the model demonstrated: |
|
|
| * Improved image clarity |
| * Reduced haze and smoke artifacts |
| * Enhanced contrast and edge visibility |
|
|
| --- |
|
|
| # Environmental Impact |
|
|
| Training performed on: |
|
|
| * **Hardware Type:** NVIDIA RTX 4060 Laptop GPU |
| * **Framework:** PyTorch CUDA |
|
|
| --- |
|
|
| # Technical Specifications |
|
|
| ## Model Architecture and Objective |
|
|
| * Pix2Pix GAN |
| * UNet-256 Generator |
| * Image-to-image translation objective |
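In the reference Pix2Pix implementation, the `unet_256` generator uses eight stride-2 downsampling steps, which is what makes 256x256 the natural input size. Assuming this checkpoint follows that design (the card does not confirm it), the encoder's spatial sizes work out as in this small sketch, where `encoder_sizes` is a hypothetical helper:

```python
def encoder_sizes(input_size: int = 256, num_downs: int = 8) -> list:
    """Spatial size after each stride-2 downsampling step."""
    sizes = [input_size]
    for _ in range(num_downs):
        if sizes[-1] % 2:
            raise ValueError(f"size {sizes[-1]} cannot be halved evenly")
        sizes.append(sizes[-1] // 2)
    return sizes


print(encoder_sizes())  # [256, 128, 64, 32, 16, 8, 4, 2, 1]
```

Under this assumption, inputs whose sides are not multiples of 256 would need resizing or padding before inference.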
|
|
| --- |
|
|
| ## Compute Infrastructure |
|
|
| ### Hardware |
|
|
| * NVIDIA RTX 4060 Laptop GPU |
|
|
| ### Software |
|
|
| * Python |
| * PyTorch |
| * OpenCV |
| * NumPy |
| * Flask |
|
|
| --- |
|
|
| # Citation |
|
|
| If you use this project in research or educational work, please cite the repository appropriately. |
|
|
| --- |
|
|
| # More Information |
|
|
| This project was developed as a deep learning and medical imaging research initiative focused on improving surgical visualization quality using AI-powered enhancement techniques. |
|
|
| --- |
|
|
| # Model Card Authors |
|
|
| Vishnu Das |
|
|
| --- |
|
|
| # Model Card Contact |
|
|
| For questions or collaboration: |
|
|
| * GitHub: [https://github.com/YOUR_GITHUB_USERNAME](https://github.com/YOUR_GITHUB_USERNAME) |