Commit 23c5705 by toni99c · verified · 1 Parent(s): 7e0b4d3

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -1,6 +1,8 @@
 ---
 license: unknown
 ---
+The purpose of this model is to classify a single document page image as either the beginning page of a document or a middle/end page.
+Single-page documents are classified as beginning pages. This is a first step toward the more general document boundary classification problem.
 
 To generate the embeddings, use ```google/siglip2-so400m-patch16-512``` with no fine-tuning.
 There is a small script, generate_embeddings.py, that writes a pickle file with the embeddings, given a Pandas DataFrame ```tasks_df``` with a column ```"image_path"``` containing all the image paths.
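
The embedding step described in the README can be sketched roughly as follows. This is not the contents of generate_embeddings.py, which is not shown in this commit. It is a minimal illustration that assumes the checkpoint loads through the `transformers` `AutoModel`/`AutoProcessor` classes and that the pickle holds a plain dict mapping each image path to its embedding. The function names, the batch size, and the output file name are all hypothetical.

```python
import pickle
from typing import Callable, Dict, Iterator, List, Sequence

# Checkpoint named in the README; used as-is, with no fine-tuning.
DEFAULT_MODEL = "google/siglip2-so400m-patch16-512"

EmbedFn = Callable[[Sequence[str]], List[List[float]]]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def make_siglip2_embedder(model_name: str = DEFAULT_MODEL, device: str = "cpu") -> EmbedFn:
    """Build a function that maps image paths to pooled image embeddings.

    Heavy dependencies are imported here so the helpers below stay usable
    without torch/transformers installed.
    """
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoProcessor

    model = AutoModel.from_pretrained(model_name).to(device).eval()
    processor = AutoProcessor.from_pretrained(model_name)

    def embed(paths: Sequence[str]) -> List[List[float]]:
        images = [Image.open(p).convert("RGB") for p in paths]
        inputs = processor(images=images, return_tensors="pt").to(device)
        with torch.no_grad():
            # Assumption: the pooled output of the vision tower is the embedding.
            pooled = model.vision_model(pixel_values=inputs["pixel_values"]).pooler_output
        return pooled.cpu().tolist()

    return embed


def embed_image_paths(image_paths: Sequence[str], embed_fn: EmbedFn,
                      batch_size: int = 8) -> Dict[str, List[float]]:
    """Embed every path in batches and return {image_path: embedding}."""
    embeddings: Dict[str, List[float]] = {}
    for batch in batched(list(image_paths), batch_size):
        for path, vector in zip(batch, embed_fn(batch)):
            embeddings[path] = vector
    return embeddings


def save_embeddings(embeddings: Dict[str, List[float]], out_path: str) -> None:
    """Write the embeddings dict to a pickle file (assumed layout)."""
    with open(out_path, "wb") as f:
        pickle.dump(embeddings, f)


def generate_embeddings(tasks_df, out_path: str = "embeddings.pkl") -> None:
    """End-to-end: read tasks_df["image_path"], embed, and pickle the result."""
    embed_fn = make_siglip2_embedder()
    paths = tasks_df["image_path"].tolist()
    save_embeddings(embed_image_paths(paths, embed_fn), out_path)
```

With a DataFrame in hand, calling `generate_embeddings(tasks_df)` would download the checkpoint on first use and write `embeddings.pkl`. Passing the embedder in as a function keeps the batching and saving logic testable without loading the model.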