
Anton Microscopy Project Memory

Current Development Priorities

1. Standardize Experimental Context for VLM

Priority: HIGH

  • Need consistent, structured biological context across all VLM interactions
  • Current implementation uses ad-hoc context dictionaries in app.py:305-322
  • Goal: Create standardized context schema that VLM can reliably interpret
  • Location: /anton/core/pipeline.py and /anton/vlm/interface.py
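A standardized context could be a small dataclass shared by the pipeline and the VLM interface, replacing the ad-hoc dictionaries. This is a sketch only: the field names (organism, staining, modality, etc.) are illustrative assumptions, not the agreed schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class ExperimentalContext:
    """Structured biological context passed to the VLM.

    Field names are hypothetical; the real schema should be agreed on
    across /anton/core/pipeline.py and /anton/vlm/interface.py.
    """
    organism: str
    cell_line: Optional[str] = None
    staining: list = field(default_factory=list)
    modality: str = "fluorescence"  # e.g. fluorescence, brightfield, phase
    magnification: Optional[str] = None
    treatment: Optional[str] = None

    def to_prompt_block(self) -> str:
        """Render as a deterministic key: value block the VLM can rely on."""
        parts = [f"organism: {self.organism}"]
        if self.cell_line:
            parts.append(f"cell_line: {self.cell_line}")
        if self.staining:
            parts.append(f"staining: {', '.join(self.staining)}")
        parts.append(f"modality: {self.modality}")
        if self.magnification:
            parts.append(f"magnification: {self.magnification}")
        if self.treatment:
            parts.append(f"treatment: {self.treatment}")
        return "\n".join(parts)
```

Because every prompt builds the context block through one method, the VLM sees the same keys in the same order on every call, which is the consistency goal above.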

2. Implement Qualitative Parallel Pipeline

Priority: HIGH

  • Current pipeline runs stages sequentially
  • Need parallel processing for qualitative analysis components
  • Target: /anton/analysis/qualitative.py and /anton/core/pipeline.py
  • Benefits: Faster analysis, better resource utilization
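Since the qualitative stages are mostly VLM/API-bound, a thread pool is one plausible way to fan them out; a minimal sketch (the `run_qualitative_stages` helper and its signature are assumptions, not the existing pipeline API):

```python
from concurrent.futures import ThreadPoolExecutor

def run_qualitative_stages(image, stages):
    """Run independent qualitative-analysis stages concurrently.

    `stages` maps stage name -> callable(image). Threads suit
    I/O-bound VLM calls; CPU-bound stages would need a process pool.
    """
    with ThreadPoolExecutor(max_workers=len(stages)) as pool:
        # Submit every stage up front, then gather results by name.
        futures = {name: pool.submit(fn, image) for name, fn in stages.items()}
        return {name: f.result() for name, f in futures.items()}
```

Stages that depend on each other's output (e.g. object-level questions built from global findings) would still have to run sequentially, so the fan-out applies only to the independent components.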

3. Revise VQA/Prompts for 4 Stages

Priority: MEDIUM-HIGH

  • All four stage prompts need a complete overhaul

  • Current prompts: /prompts/stage1_global.txt, stage2_objects.txt, stage3_features.txt, stage4_population.txt
  • Need more specific, microscopy-focused questioning
  • Improve consistency and biological relevance

4. Overhaul CMPO Mapping Strategy

Priority: MEDIUM-HIGH

  • Current CMPO mapping in /anton/cmpo/mapping.py needs a complete rethink
  • Issues: Basic keyword matching, low confidence scores
  • Goal: Semantic understanding, context-aware phenotype classification
  • Consider: LLM-based mapping, embedding similarity, hierarchical classification
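The embedding-similarity option above could look like the following sketch. A bag-of-words cosine stands in here for a real sentence-embedding model, and the CMPO IDs are placeholders, not real ontology terms:

```python
import math
from collections import Counter

def _vec(text: str) -> Counter:
    """Toy bag-of-words vector; a real version would use sentence embeddings."""
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_to_cmpo(description: str, cmpo_terms: dict, top_k: int = 3):
    """Rank CMPO terms by similarity to a free-text phenotype description.

    cmpo_terms maps CMPO ID -> term label/definition. Returns
    (id, score) pairs, highest-scoring first.
    """
    d = _vec(description)
    scored = [(cid, _cosine(d, _vec(label))) for cid, label in cmpo_terms.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]
```

Swapping `_vec`/`_cosine` for an embedding model keeps the ranking interface unchanged, so the scoring backend can be upgraded without touching callers, and the scores give a more honest confidence signal than keyword hits.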

VLM Integration Options

Current Implementation

  • Uses Google Gemini 1.5 Flash via the google-generativeai library
  • Authentication: GOOGLE_API_KEY environment variable
  • Location: /anton/vlm/interface.py:79-91
  • Supports multimodal microscopy image analysis

Hugging Face VLM Alternatives

Open Source Models (free, run locally):

  • microsoft/kosmos-2-patch14-224 - Good for object detection
  • Salesforce/blip2-opt-2.7b - Image captioning and QA
  • llava-hf/llava-1.5-7b-hf - Strong multimodal reasoning

Hugging Face Inference API (hosted):

  • meta-llama/Llama-3.2-11B-Vision-Instruct
  • microsoft/Phi-3.5-vision-instruct
  • Qwen/Qwen2-VL-7B-Instruct

Benefits of switching to HF VLMs:

  • No API costs for local models
  • Better privacy (data stays local)
  • More control over model behavior
  • Can fine-tune for microscopy-specific tasks
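To make the Gemini-to-HF switch cheap, the VLM interface could sit behind a small backend abstraction. A minimal sketch, assuming hypothetical names (`VLMBackend`, `analyze`); the stub backend below only shows the shape, while real implementations would wrap google-generativeai or transformers/huggingface_hub:

```python
from abc import ABC, abstractmethod

class VLMBackend(ABC):
    """Backend interface so Gemini and Hugging Face models are swappable.

    Illustrative only; the real interface lives in /anton/vlm/interface.py.
    """

    @abstractmethod
    def analyze(self, image, prompt: str) -> str:
        """Return the model's answer for one image + prompt pair."""

class EchoBackend(VLMBackend):
    """Stub backend for tests and offline development."""

    def analyze(self, image, prompt: str) -> str:
        return f"[stub] {prompt}"

def run_stage(backend: VLMBackend, image, prompt: str) -> str:
    """Pipeline code depends only on the interface, not a vendor SDK."""
    return backend.analyze(image, prompt)
```

With this seam in place, pipeline and UI code never import a vendor SDK directly, so trying a local HF model is a one-line backend swap rather than a rewrite.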

Integration Points

  • Main VLM interface: /anton/vlm/interface.py
  • Pipeline integration: /anton/core/pipeline.py:23-28
  • UI configuration: /app.py:57-69
  • Dependencies: requirements.txt:8 (google-generativeai)