correct readme for image bbox output format
README.md
CHANGED
@@ -69,7 +69,7 @@ LightOnOCR-2 is an efficient end-to-end 1B-parameter vision-language model for c
 The output format for embedded images is:
 
 ```
-
+x1,y1,x2,y2
 ```
 
 Where coordinates are normalized to `[0, 1000]`.
@@ -122,7 +122,7 @@ output_ids = model.generate(**inputs, max_new_tokens=1024)
 generated_ids = output_ids[0, inputs["input_ids"].shape[1]:]
 output_text = processor.decode(generated_ids, skip_special_tokens=True)
 print(output_text)
-# Output will include bounding boxes like: 
+# Output will include bounding boxes like this(no space): 120,50,850,400
 ```
 
 ---
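
Since the README states that boxes are written as `x1,y1,x2,y2` with no spaces and normalized to `[0, 1000]`, a reader may want to map them back to pixels. The following is only a minimal sketch, not part of the model's tooling: the helper names `find_bboxes` and `to_pixels` are hypothetical, the regular expression is an assumption about how a box appears in `output_text`, and it further assumes that x values scale with image width and y values with image height.

```python
import re

# Assumption: a box is four comma-separated integers with no spaces,
# in x1,y1,x2,y2 order, e.g. "120,50,850,400". The lookarounds avoid
# matching inside a longer comma-separated run of numbers.
BBOX_PATTERN = re.compile(
    r"(?<![\d,])(\d{1,4}),(\d{1,4}),(\d{1,4}),(\d{1,4})(?![\d,])"
)


def find_bboxes(text):
    """Return every (x1, y1, x2, y2) tuple found in the decoded text."""
    return [tuple(int(v) for v in m.groups()) for m in BBOX_PATTERN.finditer(text)]


def to_pixels(bbox, width, height, scale=1000):
    """Convert a box normalized to [0, scale] into pixel coordinates."""
    x1, y1, x2, y2 = bbox
    return (
        x1 * width / scale,
        y1 * height / scale,
        x2 * width / scale,
        y2 * height / scale,
    )


sample = "Output will include a box: 120,50,850,400"
boxes = find_bboxes(sample)
print(boxes)  # [(120, 50, 850, 400)]
print(to_pixels(boxes[0], width=2000, height=1000))  # (240.0, 50.0, 1700.0, 400.0)
```

Because ordinary OCR text can also contain four comma-separated numbers, the pattern may produce false positives on real pages; anchoring it to whatever marker precedes the coordinates in the actual output would be more reliable.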