pskeshu Claude committed on
Commit 818797a · 1 Parent(s): 9d5f53d

Fix Gemini VLM truncation and improve 4-stage analysis prompts

- Increase max_output_tokens for Gemini from default to 2000 to prevent truncation
- Add proper generation_config with temperature and sampling parameters
- Update all 4 stage prompts with detailed, biological context-aware questions
- Replace technical format requirements with natural language instructions
- Improve prompt clarity for better VLM understanding and responses

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

anton/vlm/interface.py CHANGED
@@ -298,19 +298,30 @@ class VLMInterface:
              raise
  
      async def _call_gemini(self, prompt: str, image_data: Optional[str] = None) -> str:
-         """Call Gemini API."""
+         """Call Gemini API with improved token limits."""
          try:
+             # Import here to avoid issues
+             import google.generativeai as genai
+
+             # Configure generation parameters for longer responses
+             generation_config = genai.types.GenerationConfig(
+                 max_output_tokens=2000,
+                 temperature=0.1,
+                 top_p=0.8,
+                 top_k=20
+             )
+
              if image_data:
                  # Decode base64 image for Gemini
                  image_bytes = base64.b64decode(image_data)
                  pil_image = Image.open(BytesIO(image_bytes))
  
                  response = await asyncio.to_thread(
-                     self.client.generate_content, [prompt, pil_image]
+                     self.client.generate_content, [prompt, pil_image], generation_config=generation_config
                  )
              else:
                  response = await asyncio.to_thread(
-                     self.client.generate_content, prompt
+                     self.client.generate_content, prompt, generation_config=generation_config
                  )
  
              return response.text
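
The change above relies on `asyncio.to_thread` forwarding keyword arguments to the wrapped callable, which is how `generation_config` reaches `generate_content`. The following is a minimal sketch of that forwarding, not the project's code: `FakeClient` and `call_with_config` are hypothetical stand-ins so the example runs without the Gemini SDK or network access, and the configuration is written as a plain dict purely for illustration.

```python
import asyncio
from typing import Any, Optional


class FakeClient:
    """Hypothetical stand-in for the Gemini model object; it only records its inputs."""

    def generate_content(self, contents: Any, generation_config: Optional[dict] = None) -> dict:
        # Echo the arguments back so the caller can inspect what was forwarded.
        return {"contents": contents, "generation_config": generation_config}


async def call_with_config(client: FakeClient, prompt: str) -> dict:
    # Same values as in the diff, expressed as a plain dict for illustration.
    generation_config = {
        "max_output_tokens": 2000,
        "temperature": 0.1,
        "top_p": 0.8,
        "top_k": 20,
    }
    # asyncio.to_thread(func, /, *args, **kwargs) runs func in a worker thread
    # and passes keyword arguments through unchanged.
    return await asyncio.to_thread(
        client.generate_content, prompt, generation_config=generation_config
    )


result = asyncio.run(call_with_config(FakeClient(), "Describe this image."))
```

Because the blocking call runs in a worker thread, the event loop stays free for other coroutines while the request is in flight.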
prompts/stage1_global.txt CHANGED
@@ -1,9 +1,11 @@
- Analyze this fluorescence microscopy image for overall scene understanding.
- Provide:
- 1. Image quality assessment
- 2. Staining type identification
- 3. General cellular/tissue characteristics
+ Analyze this microscopy image and provide a detailed description.
+
+ Please describe:
+ 1. Overall image quality and clarity
+ 2. Visible cellular structures
+ 3. Any notable features or patterns
  4. Recommended analysis approach
  
  Focus on: [DYNAMIC_CONTEXT]
- Output format: Natural language description + structured assessment
+
+ Provide your response in a clear, structured format with detailed biological insights.
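
The stage 1 prompt keeps its `[DYNAMIC_CONTEXT]` placeholder, so something must substitute the experimental context before the prompt is sent. The commit does not show how that happens; the sketch below is one possible approach, with `fill_context` as a hypothetical helper and the context string as made-up illustrative input.

```python
# Abbreviated copy of the new stage 1 template, for illustration only.
TEMPLATE = """Analyze this microscopy image and provide a detailed description.

Focus on: [DYNAMIC_CONTEXT]
"""


def fill_context(template: str, context: str, placeholder: str = "[DYNAMIC_CONTEXT]") -> str:
    """Replace the placeholder with the caller's context (hypothetical helper)."""
    if placeholder not in template:
        # Fail loudly rather than silently sending a prompt without context.
        raise ValueError(f"template has no {placeholder} placeholder")
    return template.replace(placeholder, context.strip())


# The context value here is an invented example, not taken from the project.
prompt = fill_context(TEMPLATE, "  mitochondrial morphology in HeLa cells ")
```

Raising on a missing placeholder is a design choice in this sketch: it surfaces template edits that accidentally drop the marker.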
prompts/stage2_objects.txt CHANGED
@@ -1,7 +1,9 @@
- Task: Identify major structures and suggest segmentation strategies for this fluorescence microscopy image.
- Input: [image, global context]
- Provide:
- 1. List of detected objects/structures
- 2. Segmentation guidance (e.g., nuclei, cytoplasm)
- 3. Object count estimate
- Output format: Structured JSON with detected_objects, segmentation_guidance, object_count_estimate
+ Based on this microscopy image, identify and describe all visible cellular objects and structures.
+
+ Please identify:
+ 1. Number and types of cells visible
+ 2. Subcellular structures (nuclei, organelles, etc.)
+ 3. Cell boundaries and morphology
+ 4. Any specific features related to the experimental context
+
+ Provide a structured list of detected objects with confidence estimates.
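
The old stage 2 prompt asked for structured JSON, while the new one asks for a structured list in natural language, so any consumer that previously assumed JSON may now receive plain text. How the project handles this is not shown in the commit. As a rough sketch under that assumption, a hypothetical `parse_stage_response` helper could accept either form:

```python
import json
from typing import Any, Dict


def parse_stage_response(text: str) -> Dict[str, Any]:
    """Return parsed JSON when the reply is a JSON object, else the raw text.

    Hypothetical helper: the result keys "format" and "content" are
    illustrative and not part of the project's interface.
    """
    stripped = text.strip()
    try:
        parsed = json.loads(stripped)
    except json.JSONDecodeError:
        # Natural-language replies (the new prompt style) land here.
        return {"format": "text", "content": stripped}
    if isinstance(parsed, dict):
        return {"format": "json", "content": parsed}
    # Valid JSON that is not an object (e.g. a bare number) is treated as text.
    return {"format": "text", "content": stripped}
```

For example, `parse_stage_response('{"object_count_estimate": 12}')` yields the `"json"` format, while a numbered list of observations yields `"text"`.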
prompts/stage3_features.txt CHANGED
@@ -1,14 +1,10 @@
- Task: Analyze segmented regions for complex CMPO features, focusing on texture-based patterns.
- Input:
- - Region patches: [nuclei, cytoplasm patches]
- - Config: {stain: "{STAIN}", channel: {CHANNEL}, phenotype_focus: "{PHENOTYPE}"}
- - Target features: ["chromatin_condensation", "LC3_puncta", "nuclear_fragmentation"]
- Analyze:
- 1. Texture patterns (granular, smooth, fragmented)
- 2. Intensity distributions (bright spots, uniform, heterogeneous)
- 3. Morphological features (shape irregularities, size variations)
- Provide:
- 1. Feature descriptions with confidence scores
- 2. CMPO term mappings
- 3. Supporting visual evidence
- Output format: {features: [{name: str, confidence: float, evidence: str, cmpo_id: str}]}
+ Analyze the morphological and textural features of the cellular structures in this microscopy image.
+
+ Focus on:
+ 1. Cell shape characteristics and size variations
+ 2. Nuclear morphology and chromatin patterns
+ 3. Cytoplasmic texture and organization
+ 4. Protein localization patterns (if fluorescent)
+ 5. Any pathological or experimental features
+
+ Describe specific features with confidence scores and biological significance.
prompts/stage4_population.txt CHANGED
@@ -1,10 +1,10 @@
- Task: Analyze population-level patterns in this fluorescence microscopy image.
- Input:
- - Individual cell feature analyses (morphology, intensity, localization patterns)
- - Experimental context and biological metadata
- Provide:
- 1. Population summary describing overall cellular characteristics and phenotype prevalence
- 2. Quantitative estimates (e.g., percentage of cells showing specific phenotypes)
- 3. Biological interpretation of patterns in experimental context
- 4. CMPO phenotype relevance assessment
- Output format: Natural language biological analysis with structured sections for population summary, quantitative insights, and biological interpretation
+ Provide population-level analysis of the cells in this microscopy image.
+
+ Analyze:
+ 1. Overall cellular health and viability
+ 2. Population heterogeneity
+ 3. Distribution of phenotypic characteristics
+ 4. Experimental readout interpretation
+ 5. Statistical observations about the cell population
+
+ Provide insights relevant to the experimental goals and biological context.