nightknocker committed · verified · Commit 72a0e8d · Parent(s): fff074b

Update README.md

README.md CHANGED
@@ -34,6 +34,12 @@ It was trained on both T5 (text) and the [AnimaTextToImagePipeline](https://hugg
## Z-Image and Qwen

- LLMs have redundant knowledge (arXiv:2511.07384, arXiv:2403.03853). Thus, resorting to smaller language models does not result in irrecoverable knowledge loss, as has been [demonstrated](https://huggingface.co/nightknocker/recurrent-qwen3-z-image-turbo). This is particularly true for specialized anime models.

## Subject-Focused Attention

In a subject-verb-object (SVO) sentence, CLIP-style text encoders focus too heavily on the subject, are undertrained on certain verbs, and cannot reliably identify the object's position.

This repo is an experiment to address these issues: spatial knowledge is encoded explicitly, so the attention modules are not overwhelmed by the task.
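One way to read "spatial knowledge is encoded explicitly" is as a per-token reweighting that damps subject tokens and boosts object tokens before attention is applied, so the object's position no longer has to be inferred from an undertrained verb. The sketch below illustrates that idea in plain Python; the function name, role labels, and scale factors are illustrative assumptions, not this repo's actual implementation:

```python
# Illustrative sketch only: de-emphasize subject tokens and boost object
# tokens so attention over an SVO prompt is not dominated by the subject.
# Role labels and scale factors are assumptions, not this repo's code.

def rebalance_token_weights(tokens, roles, subject_damp=0.7, object_boost=1.3):
    """Return normalized per-token weights with the subject de-emphasized."""
    scale = {"subject": subject_damp, "object": object_boost}
    raw = [scale.get(roles.get(tok), 1.0) for tok in tokens]
    total = sum(raw)
    return [w / total for w in raw]

# "a girl holds an apple" reduced to its SVO content words
tokens = ["girl", "holds", "apple"]
roles = {"girl": "subject", "apple": "object"}
weights = rebalance_token_weights(tokens, roles)
print(dict(zip(tokens, weights)))
```

Because the weights are normalized to a fixed budget, boosting the object necessarily costs the subject some attention mass, which is the intended effect.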
 
## Inference