Commit 0989643
Parent(s): da343a7
Use bfloat16 on CPU to halve memory (8GB vs 16GB float32)
Free-tier HF Spaces have 16GB RAM. MedGemma 4B in float32 consumed
~16GB for weights alone, causing swap thrashing. bfloat16 on CPU is
supported by modern PyTorch and halves memory to ~8GB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
models/medgemma_agent.py  +1 -1
@@ -255,7 +255,7 @@ class MedGemmaAgent:
             self._print("Using CPU")
 
         model_kwargs = dict(
-            dtype=torch.bfloat16
+            dtype=torch.bfloat16,  # bfloat16 on all devices (halves RAM on CPU: ~8GB vs ~16GB)
             device_map="auto",
         )
 
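The memory claim in the commit message can be sanity-checked with back-of-the-envelope arithmetic. This sketch assumes ~4e9 weights for "MedGemma 4B" (an approximation, not a figure from the commit) and the standard per-element sizes of 4 bytes for float32 and 2 bytes for bfloat16:

```python
# Rough check of the float32-vs-bfloat16 weight-memory claim.
# PARAMS is an assumed round number for a "4B" model, not an exact count.
PARAMS = 4_000_000_000
BYTES_FP32 = 4   # float32: 4 bytes per parameter
BYTES_BF16 = 2   # bfloat16: 2 bytes per parameter


def weights_gib(n_params: int, bytes_per_param: int) -> float:
    """Weight storage in GiB, ignoring activations and framework overhead."""
    return n_params * bytes_per_param / 1024**3


print(f"float32:  {weights_gib(PARAMS, BYTES_FP32):.1f} GiB")   # ~14.9 GiB
print(f"bfloat16: {weights_gib(PARAMS, BYTES_BF16):.1f} GiB")   # ~7.5 GiB
```

Weights alone in float32 land close to the Space's 16GB RAM ceiling, so halving them to bfloat16 is what moves the model from swap thrashing to a comfortable fit, consistent with the "~8GB vs ~16GB" figures in the diff comment.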