Spaces: pskeshu/anton-microscopy

pskeshu Claude committed on Jul 13, 2025

Commit 818797a

1 Parent(s): 9d5f53d

Fix Gemini VLM truncation and improve 4-stage analysis prompts

- Increase max_output_tokens for Gemini from default to 2000 to prevent truncation
- Add proper generation_config with temperature and sampling parameters
- Update all 4 stage prompts with detailed, biological context-aware questions
- Replace technical format requirements with natural language instructions
- Improve prompt clarity for better VLM understanding and responses

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
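
Whether the 2000-token limit is sufficient can be checked at runtime instead of by reading the output. The sketch below is one way to flag a cut-off reply. It assumes the response object exposes `candidates[0].finish_reason` and that the reason is named `MAX_TOKENS` when the output limit is hit; both are assumptions about the SDK, and `was_truncated` is a hypothetical helper that is not part of this commit.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """Return True if the first candidate stopped because of the token limit.

    Assumption: the response exposes candidates[0].finish_reason, and that
    reason is named "MAX_TOKENS" when the output limit was reached.
    """
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return False
    reason = candidates[0].finish_reason
    # Accept either an enum-like object with .name or a plain string.
    return getattr(reason, "name", str(reason)) == "MAX_TOKENS"


# Stand-in objects for illustration only, not real SDK responses.
cut_off = SimpleNamespace(candidates=[SimpleNamespace(finish_reason="MAX_TOKENS")])
complete = SimpleNamespace(candidates=[SimpleNamespace(finish_reason="STOP")])
print(was_truncated(cut_off), was_truncated(complete))
```

A caller could log a warning or retry with a larger limit when this returns True, instead of silently passing a partial analysis to the next stage.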

Files changed (5)

anton/vlm/interface.py +14 -3
prompts/stage1_global.txt +8 -6
prompts/stage2_objects.txt +9 -7
prompts/stage3_features.txt +10 -14
prompts/stage4_population.txt +10 -10

anton/vlm/interface.py CHANGED

@@ -298,19 +298,30 @@ class VLMInterface:
             raise
     async def _call_gemini(self, prompt: str, image_data: Optional[str] = None) -> str:
-        """Call Gemini API."""
+        """Call Gemini API with improved token limits."""
         try:
+            # Import here to avoid issues
+            import google.generativeai as genai
+            # Configure generation parameters for longer responses
+            generation_config = genai.types.GenerationConfig(
+                max_output_tokens=2000,
+                temperature=0.1,
+                top_p=0.8,
+                top_k=20
+            )
             if image_data:
                 # Decode base64 image for Gemini
                 image_bytes = base64.b64decode(image_data)
                 pil_image = Image.open(BytesIO(image_bytes))
                 response = await asyncio.to_thread(
-                    self.client.generate_content, [prompt, pil_image]
+                    self.client.generate_content, [prompt, pil_image], generation_config=generation_config
                 )
             else:
                 response = await asyncio.to_thread(
-                    self.client.generate_content, prompt
+                    self.client.generate_content, prompt, generation_config=generation_config
                 )
             return response.text
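
A note on the call pattern in this hunk: `asyncio.to_thread` forwards keyword arguments as well as positional ones, so `generation_config=` reaches `generate_content` unchanged. The sketch below demonstrates this with a stand-in client. `StubClient` and `call_model` are hypothetical names used only for illustration; they are not part of the repository or of `google.generativeai`, and the config is shown as a plain dict only to keep the example free of SDK imports.

```python
import asyncio
from types import SimpleNamespace


class StubClient:
    """Hypothetical stand-in for the model object; records what it receives."""

    def generate_content(self, contents, generation_config=None):
        # Echo the arguments so the caller can inspect what was forwarded.
        return SimpleNamespace(text="ok", contents=contents, config=generation_config)


async def call_model(client, prompt, generation_config):
    # asyncio.to_thread(func, *args, **kwargs) runs func in a worker thread
    # and passes keyword arguments through unchanged (Python 3.9+).
    return await asyncio.to_thread(
        client.generate_content, prompt, generation_config=generation_config
    )


config = {"max_output_tokens": 2000, "temperature": 0.1, "top_p": 0.8, "top_k": 20}
response = asyncio.run(call_model(StubClient(), "describe the image", config))
print(response.config["max_output_tokens"])
```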

prompts/stage1_global.txt CHANGED

@@ -1,9 +1,11 @@
-Analyze this fluorescence microscopy image for overall scene understanding.
-Provide:
-1. Image quality assessment
-2. Staining type identification
-3. General cellular/tissue characteristics
+Analyze this microscopy image and provide a detailed description.
+Please describe:
+1. Overall image quality and clarity
+2. Visible cellular structures
+3. Any notable features or patterns
 4. Recommended analysis approach
 Focus on: [DYNAMIC_CONTEXT]
-Output format: Natural language description + structured assessment
+Provide your response in a clear, structured format with detailed biological insights.
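
The `[DYNAMIC_CONTEXT]` marker survives this change, so something still has to fill it before the prompt is sent. How the repository does that is not shown in this diff; the sketch below is one plausible approach using plain string replacement. `fill_context`, the shortened template, the fallback phrase, and the example context are all hypothetical.

```python
# Shortened copy of the new stage 1 prompt, for illustration only.
STAGE1_TEMPLATE = (
    "Analyze this microscopy image and provide a detailed description.\n"
    "Focus on: [DYNAMIC_CONTEXT]\n"
)


def fill_context(template: str, context: str) -> str:
    """Replace the [DYNAMIC_CONTEXT] marker with the experiment context."""
    # The fallback wording is an assumption, chosen so the prompt still reads
    # naturally when no context is available.
    return template.replace("[DYNAMIC_CONTEXT]", context.strip() or "general morphology")


prompt = fill_context(STAGE1_TEMPLATE, "DAPI-stained nuclei")
print(prompt)
```

Using `str.replace` instead of `str.format` avoids surprises if a prompt ever contains literal braces.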

prompts/stage2_objects.txt CHANGED

@@ -1,7 +1,9 @@
-Task: Identify major structures and suggest segmentation strategies for this fluorescence microscopy image.
-Input: [image, global context]
-Provide:
-1. List of detected objects/structures
-2. Segmentation guidance (e.g., nuclei, cytoplasm)
-3. Object count estimate
-Output format: Structured JSON with detected_objects, segmentation_guidance, object_count_estimate

+Based on this microscopy image, identify and describe all visible cellular objects and structures.
+Please identify:
+1. Number and types of cells visible
+2. Subcellular structures (nuclei, organelles, etc.)
+3. Cell boundaries and morphology
+4. Any specific features related to the experimental context
+Provide a structured list of detected objects with confidence estimates.
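
The removed prompt asked for JSON with a `detected_objects` key, while the new one asks for a structured list in natural language. If any downstream code still expects JSON, a tolerant parser is one way to handle both reply styles. The sketch below tries JSON first and then falls back to numbered or bulleted lines. `parse_objects` is a hypothetical helper, and the reply shapes it handles are assumptions about what the model might return.

```python
import json
import re


def parse_objects(reply: str) -> list[str]:
    """Return object descriptions from a JSON reply or a plain list reply."""
    try:
        data = json.loads(reply)
        if isinstance(data, dict) and isinstance(data.get("detected_objects"), list):
            return [str(item) for item in data["detected_objects"]]
    except json.JSONDecodeError:
        pass  # Not JSON; fall through to line-based parsing.

    items = []
    for line in reply.splitlines():
        # Match lines that start with "1.", "1)", "-", or "*".
        match = re.match(r"\s*(?:\d+[.)]|[-*])\s+(.*\S)", line)
        if match:
            items.append(match.group(1))
    return items


print(parse_objects('{"detected_objects": ["nuclei", "cytoplasm"]}'))
print(parse_objects("Detected:\n1. Nuclei (high)\n- Cytoplasm (medium)\n"))
```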

prompts/stage3_features.txt CHANGED

@@ -1,14 +1,10 @@
-Task: Analyze segmented regions for complex CMPO features, focusing on texture-based patterns.
-Input:
-    - Region patches: [nuclei, cytoplasm patches]
-    - Config: {stain: "{STAIN}", channel: {CHANNEL}, phenotype_focus: "{PHENOTYPE}"}
-    - Target features: ["chromatin_condensation", "LC3_puncta", "nuclear_fragmentation"]
-Analyze:
-    1. Texture patterns (granular, smooth, fragmented)
-    2. Intensity distributions (bright spots, uniform, heterogeneous)
-    3. Morphological features (shape irregularities, size variations)
-Provide:
-    1. Feature descriptions with confidence scores
-    2. CMPO term mappings
-    3. Supporting visual evidence
-Output format: {features: [{name: str, confidence: float, evidence: str, cmpo_id: str}]}

+Analyze the morphological and textural features of the cellular structures in this microscopy image.
+Focus on:
+1. Cell shape characteristics and size variations
+2. Nuclear morphology and chromatin patterns
+3. Cytoplasmic texture and organization
+4. Protein localization patterns (if fluorescent)
+5. Any pathological or experimental features
+Describe specific features with confidence scores and biological significance.
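
The removed stage 3 prompt carried three substitution markers (`{STAIN}`, `{CHANNEL}`, `{PHENOTYPE}`); the replacement has none, so the stain, channel, and phenotype settings no longer reach this prompt through the template. A small scanner like the one below could confirm which markers a template still contains. `find_placeholders` and the marker pattern are hypothetical; the pattern assumes markers are upper-case names wrapped in braces or square brackets, as seen in these files.

```python
import re

# Upper-case names wrapped in {...} or [...], e.g. {STAIN} or [DYNAMIC_CONTEXT].
PLACEHOLDER = re.compile(r"\{([A-Z_]+)\}|\[([A-Z_]+)\]")


def find_placeholders(template: str) -> list[str]:
    """List placeholder names in order of appearance."""
    return [curly or square for curly, square in PLACEHOLDER.findall(template)]


old_line = '- Config: {stain: "{STAIN}", channel: {CHANNEL}, phenotype_focus: "{PHENOTYPE}"}'
new_line = "Describe specific features with confidence scores and biological significance."
print(find_placeholders(old_line))
print(find_placeholders(new_line))
```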

prompts/stage4_population.txt CHANGED

@@ -1,10 +1,10 @@
-Task: Analyze population-level patterns in this fluorescence microscopy image.
-Input:
-    - Individual cell feature analyses (morphology, intensity, localization patterns)
-    - Experimental context and biological metadata
-Provide:
-    1. Population summary describing overall cellular characteristics and phenotype prevalence
-    2. Quantitative estimates (e.g., percentage of cells showing specific phenotypes)
-    3. Biological interpretation of patterns in experimental context
-    4. CMPO phenotype relevance assessment
-Output format: Natural language biological analysis with structured sections for population summary, quantitative insights, and biological interpretation

+Provide population-level analysis of the cells in this microscopy image.
+Analyze:
+1. Overall cellular health and viability
+2. Population heterogeneity
+3. Distribution of phenotypic characteristics
+4. Experimental readout interpretation
+5. Statistical observations about the cell population
+Provide insights relevant to the experimental goals and biological context.