# Model optimizations
**Vision encoder efficiency**
Given the high resolutions supported, the vision part of the model can be memory-hungry depending on your configuration. If you are GPU-memory-constrained, you can:
- **deactivate image splitting.** To do so, add `do_image_splitting=False` when initializing the processor (`AutoProcessor.from_pretrained`). No changes are required on the model side. Note that only the sft model has been trained with image splitting.
- **decrease the maximum image resolution.** To do so, add `size={"longest_edge": 448, "shortest_edge": 378}` when initializing the processor (`AutoProcessor.from_pretrained`). In particular, the `longest_edge` value can be adapted to fit your needs. We recommend using values that are multiples of 14. No changes are required on the model side.
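
As a minimal sketch, both options above are keyword arguments to `AutoProcessor.from_pretrained`; the checkpoint name below is a placeholder, so substitute this model's actual repo id:

```python
from transformers import AutoProcessor

# Placeholder repo id; replace with this model's checkpoint.
checkpoint = "HuggingFaceM4/model-id"

# Option 1: deactivate image splitting to reduce vision-encoder memory.
processor = AutoProcessor.from_pretrained(checkpoint, do_image_splitting=False)

# Option 2: cap the maximum image resolution; keep edge sizes
# multiples of 14 (the vision encoder's patch size).
processor = AutoProcessor.from_pretrained(
    checkpoint,
    size={"longest_edge": 448, "shortest_edge": 378},
)
```

The two options can also be combined in a single `from_pretrained` call.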
**Using Flash-attention 2 to speed up generation**
<details><summary>Click to expand.</summary>