Fix Gemini VLM truncation and improve 4-stage analysis prompts
- Increase max_output_tokens for Gemini from default to 2000 to prevent truncation
- Add proper generation_config with temperature and sampling parameters
- Update all 4 stage prompts with detailed, biological context-aware questions
- Replace technical format requirements with natural language instructions
- Improve prompt clarity for better VLM understanding and responses
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- anton/vlm/interface.py +14 -3
- prompts/stage1_global.txt +8 -6
- prompts/stage2_objects.txt +9 -7
- prompts/stage3_features.txt +10 -14
- prompts/stage4_population.txt +10 -10
anton/vlm/interface.py
@@ -298,19 +298,30 @@ class VLMInterface:
             raise
 
     async def _call_gemini(self, prompt: str, image_data: Optional[str] = None) -> str:
-        """Call Gemini API."""
+        """Call Gemini API with improved token limits."""
         try:
+            # Import here to avoid issues
+            import google.generativeai as genai
+
+            # Configure generation parameters for longer responses
+            generation_config = genai.types.GenerationConfig(
+                max_output_tokens=2000,
+                temperature=0.1,
+                top_p=0.8,
+                top_k=20
+            )
+
             if image_data:
                 # Decode base64 image for Gemini
                 image_bytes = base64.b64decode(image_data)
                 pil_image = Image.open(BytesIO(image_bytes))
 
                 response = await asyncio.to_thread(
-                    self.client.generate_content, [prompt, pil_image]
+                    self.client.generate_content, [prompt, pil_image], generation_config=generation_config
                 )
             else:
                 response = await asyncio.to_thread(
-                    self.client.generate_content, prompt
+                    self.client.generate_content, prompt, generation_config=generation_config
                 )
 
             return response.text
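The updated `_call_gemini` depends on `asyncio.to_thread` forwarding the `generation_config=` keyword to the blocking `generate_content` call. A minimal sketch of that pattern follows. `StubClient` and `call_with_config` are hypothetical names, used here in place of the real Gemini model object so the snippet runs without network access or the `google.generativeai` package. The plain dict is also an assumption for illustration; the diff itself builds a `GenerationConfig` object.

```python
import asyncio
from typing import Any, Dict, Optional


class StubClient:
    """Hypothetical stand-in for the Gemini model object (not the real API)."""

    def generate_content(
        self, contents: Any, generation_config: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        # Echo back what was received so the forwarding can be inspected.
        return {"contents": contents, "generation_config": generation_config}


async def call_with_config(client: StubClient, prompt: str) -> Dict[str, Any]:
    # Same values as in the diff, held in a plain dict for this sketch.
    generation_config = {
        "max_output_tokens": 2000,
        "temperature": 0.1,
        "top_p": 0.8,
        "top_k": 20,
    }
    # asyncio.to_thread passes positional and keyword arguments through
    # to the blocking callable, which runs in a worker thread.
    return await asyncio.to_thread(
        client.generate_content, prompt, generation_config=generation_config
    )


result = asyncio.run(call_with_config(StubClient(), "Describe this image."))
```

Because the keyword reaches the callable unchanged, no `functools.partial` wrapper is needed around `generate_content`.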
prompts/stage1_global.txt
@@ -1,9 +1,11 @@
-Analyze this
-
-
-
-
+Analyze this microscopy image and provide a detailed description.
+
+Please describe:
+1. Overall image quality and clarity
+2. Visible cellular structures
+3. Any notable features or patterns
 4. Recommended analysis approach
 
 Focus on: [DYNAMIC_CONTEXT]
-
+
+Provide your response in a clear, structured format with detailed biological insights.
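The stage 1 prompt keeps the `[DYNAMIC_CONTEXT]` placeholder. The diff does not show how the project fills it, so the following is only a sketch of one possible substitution step. `fill_context` is a hypothetical helper, and the context string in the usage line is an invented example.

```python
STAGE1_TEMPLATE = """Analyze this microscopy image and provide a detailed description.

Please describe:
1. Overall image quality and clarity
2. Visible cellular structures
3. Any notable features or patterns
4. Recommended analysis approach

Focus on: [DYNAMIC_CONTEXT]

Provide your response in a clear, structured format with detailed biological insights.
"""

PLACEHOLDER = "[DYNAMIC_CONTEXT]"


def fill_context(template: str, context: str) -> str:
    """Replace the placeholder with caller-supplied experimental context."""
    if PLACEHOLDER not in template:
        # Fail loudly rather than silently sending a prompt without context.
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return template.replace(PLACEHOLDER, context.strip())


prompt = fill_context(STAGE1_TEMPLATE, "nuclear morphology in fixed cells")
```

Raising on a missing placeholder is a design choice for this sketch; a caller could instead append the context when the template lacks the marker.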
prompts/stage2_objects.txt
@@ -1,7 +1,9 @@
-
-
-
-1.
-2.
-3.
-
+Based on this microscopy image, identify and describe all visible cellular objects and structures.
+
+Please identify:
+1. Number and types of cells visible
+2. Subcellular structures (nuclei, organelles, etc.)
+3. Cell boundaries and morphology
+4. Any specific features related to the experimental context
+
+Provide a structured list of detected objects with confidence estimates.
prompts/stage3_features.txt
@@ -1,14 +1,10 @@
-
-
-
-
-
-
-
-
-
-
-1. Feature descriptions with confidence scores
-2. CMPO term mappings
-3. Supporting visual evidence
-Output format: {features: [{name: str, confidence: float, evidence: str, cmpo_id: str}]}
+Analyze the morphological and textural features of the cellular structures in this microscopy image.
+
+Focus on:
+1. Cell shape characteristics and size variations
+2. Nuclear morphology and chromatin patterns
+3. Cytoplasmic texture and organization
+4. Protein localization patterns (if fluorescent)
+5. Any pathological or experimental features
+
+Describe specific features with confidence scores and biological significance.
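The stage 3 change drops the explicit `Output format: {features: [...]}` line in favor of a natural-language instruction, so replies may no longer be machine-readable. The code that consumes stage 3 output is not part of this diff. The sketch below only illustrates how a caller might accept both shapes; `Feature` and `parse_features` are hypothetical names whose fields mirror the removed format line.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    name: str
    confidence: float
    evidence: str = ""
    cmpo_id: Optional[str] = None


def parse_features(response_text: str) -> List[Feature]:
    """Return features if the reply is JSON in the old shape, else an empty list."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return []  # free-text reply: the caller keeps the raw text instead
    if not isinstance(payload, dict):
        return []
    items = payload.get("features")
    if not isinstance(items, list):
        return []
    features = []
    for item in items:
        if not isinstance(item, dict) or "name" not in item:
            continue  # skip malformed entries rather than failing the stage
        try:
            confidence = float(item.get("confidence", 0.0))
        except (TypeError, ValueError):
            confidence = 0.0
        features.append(
            Feature(
                name=str(item["name"]),
                confidence=confidence,
                evidence=str(item.get("evidence", "")),
                cmpo_id=item.get("cmpo_id"),
            )
        )
    return features
```

An empty list signals "no structured data", which lets the caller fall back to storing the model's prose unchanged.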
prompts/stage4_population.txt
@@ -1,10 +1,10 @@
-
-
-
-
-
-
-
-
-
-
+Provide population-level analysis of the cells in this microscopy image.
+
+Analyze:
+1. Overall cellular health and viability
+2. Population heterogeneity
+3. Distribution of phenotypic characteristics
+4. Experimental readout interpretation
+5. Statistical observations about the cell population
+
+Provide insights relevant to the experimental goals and biological context.
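All four prompts now live as plain text files under `prompts/`. How the project loads them is not shown in this commit; the sketch below assumes they are read in stage order from a single directory. `load_stage_prompts` and `STAGE_FILES` are hypothetical names, while the file names are taken from the diff.

```python
from pathlib import Path
from typing import Dict

# File names as listed in this commit, in stage order.
STAGE_FILES = [
    "stage1_global.txt",
    "stage2_objects.txt",
    "stage3_features.txt",
    "stage4_population.txt",
]


def load_stage_prompts(prompt_dir: Path) -> Dict[str, str]:
    """Read each stage prompt, keyed by file stem, preserving stage order."""
    prompts: Dict[str, str] = {}
    for filename in STAGE_FILES:
        path = prompt_dir / filename
        if not path.is_file():
            # A missing prompt would otherwise surface later as an empty request.
            raise FileNotFoundError(f"missing prompt file: {path}")
        prompts[path.stem] = path.read_text(encoding="utf-8").strip()
    return prompts
```

A call such as `load_stage_prompts(Path("prompts"))` would return a dict whose keys run from `stage1_global` to `stage4_population`, since Python dicts keep insertion order.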