Update README.md
README.md
CHANGED
@@ -1,6 +1,8 @@
---
license: unknown
---
+The purpose of this model is to classify a single document page image, determining whether it is the first page of a document or a middle/end page.
+Single-page documents are classified as first pages. This is a first step toward the more general document boundary classification problem.

To generate the embeddings, use ```google/siglip2-so400m-patch16-512``` without fine-tuning.
The small script generate_embeddings.py generates a pickle file with the embeddings, given a pandas DataFrame ```tasks_df``` with a column ```"image_path"``` that contains the paths of all the images.
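The embedding step described above could look roughly like the sketch below. It is not the contents of generate_embeddings.py, which this diff does not show. The function names (`load_siglip2_encoder`, `build_embeddings`, `save_embeddings`), the batch size, and the pickle layout (a dict mapping each image path to its embedding vector) are all assumptions made for illustration. The encoder relies on `AutoModel`, `AutoProcessor`, and `get_image_features` from `transformers`, and assumes `get_image_features` returns a plain tensor.

```python
import pickle
from typing import Callable, Dict, Iterable, Iterator, List

MODEL_ID = "google/siglip2-so400m-patch16-512"

# An encoder maps a list of image paths to one embedding vector per path.
Encoder = Callable[[List[str]], List[List[float]]]


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_siglip2_encoder(model_id: str = MODEL_ID) -> Encoder:
    """Build an encoder from the pretrained checkpoint (no fine-tuning).

    Hypothetical helper: heavy dependencies are imported lazily so the
    rest of this sketch works without torch/transformers/Pillow installed.
    """
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoProcessor

    model = AutoModel.from_pretrained(model_id).eval()
    processor = AutoProcessor.from_pretrained(model_id)

    def encode(paths: List[str]) -> List[List[float]]:
        images = [Image.open(p).convert("RGB") for p in paths]
        inputs = processor(images=images, return_tensors="pt")
        with torch.no_grad():
            # Assumption: this returns a (batch, dim) tensor.
            features = model.get_image_features(**inputs)
        return features.tolist()

    return encode


def build_embeddings(
    image_paths: Iterable[str], encode: Encoder, batch_size: int = 8
) -> Dict[str, List[float]]:
    """Embed every path in batches and return {path: vector}."""
    paths = list(image_paths)
    embeddings: Dict[str, List[float]] = {}
    for batch in batched(paths, batch_size):
        for path, vector in zip(batch, encode(batch)):
            embeddings[path] = vector
    return embeddings


def save_embeddings(embeddings: Dict[str, List[float]], out_path: str) -> None:
    """Write the embeddings dict to a pickle file (assumed layout)."""
    with open(out_path, "wb") as f:
        pickle.dump(embeddings, f)
```

With a DataFrame ```tasks_df``` as described, usage might be `save_embeddings(build_embeddings(tasks_df["image_path"], load_siglip2_encoder()), "embeddings.pkl")`, where the output filename is a placeholder. Passing the encoder as an argument keeps the batching and pickling logic independent of the model, so it can be checked with a stub encoder before downloading the checkpoint.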