tchung1970 Claude Opus 4.5 committed on
Commit c38f72a · 1 Parent(s): 5c03551

Update CLAUDE.md with remote text encoder documentation

- Document HF official remote text encoder endpoint
- Explain why local text encoders can't be used (storage limit)
- Update MAX_IMAGE_SIZE to 2048

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Files changed (1)
  1. CLAUDE.md +3 -3
CLAUDE.md CHANGED
@@ -33,7 +33,7 @@ The app runs on Hugging Face Spaces with ZeroGPU infrastructure. Requires `HF_TO
 
  ### Key Pipeline Details
 
- 1. **Text Encoding**: Offloaded to remote Gradio client (`multimodalart/mistral-text-encoder`) - runs on CPU, network-bound
+ 1. **Text Encoding**: Uses Hugging Face's official remote text encoder service at `https://remote-text-encoder-flux-2.huggingface.co/predict`. This is required because loading local text encoders (T5-XXL ~10GB + CLIP ~1.5GB) exceeds ZeroGPU's storage limit. Requires `HF_TOKEN` for authentication.
  2. **Prompt Upsampling**: Uses ERNIE-4.5-VL via Hugging Face Inference API - two modes:
  - Text-only: Enhances prompts with visual details
  - Image+text: Converts editing requests into concise instructions
@@ -41,6 +41,6 @@ The app runs on Hugging Face Spaces with ZeroGPU infrastructure. Requires `HF_TO
 
  ### Configuration Constants
 
- - `MAX_IMAGE_SIZE`: 1024
+ - `MAX_IMAGE_SIZE`: 2048
  - `dtype`: torch.bfloat16
- - Dimensions auto-adjust to uploaded image aspect ratio (multiples of 8, min 256, max 1024)
+ - Default aspect ratio: 1:1 (2048x2048)
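
The new Text Encoding entry names an endpoint and says it needs `HF_TOKEN`, but the diff does not show how the request is formed. The following is a minimal sketch of what an authenticated call might look like, not the app's actual code. The `{"prompt": [...]}` payload shape, the Bearer header, and the helper name `build_encoder_request` are assumptions for illustration; only the URL and the `HF_TOKEN` requirement come from the change above. The request is built but deliberately not sent.

```python
import json
import os
import urllib.request

# Endpoint quoted in the CLAUDE.md change above.
REMOTE_TEXT_ENCODER_URL = "https://remote-text-encoder-flux-2.huggingface.co/predict"


def build_encoder_request(prompts, token=None):
    """Build (but do not send) an authenticated POST request for the remote encoder.

    Assumption: the service accepts a JSON body of the form {"prompt": [...]}.
    The diff does not show the payload, so treat this shape as hypothetical.
    """
    # Fall back to the HF_TOKEN environment variable mentioned in CLAUDE.md.
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; the endpoint requires authentication")

    body = json.dumps({"prompt": list(prompts)}).encode("utf-8")
    return urllib.request.Request(
        REMOTE_TEXT_ENCODER_URL,
        data=body,
        headers={
            # Assumed: Bearer-token header, the usual Hugging Face convention.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request (for example with `urllib.request.urlopen`) and decoding the returned embeddings are left out, because the diff does not describe the response format.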