rafmacalaba committed
Commit fb7143c · 1 Parent(s): 64b4b2f

add gradio pdf and readme

Files changed (2)
  1. README.md +10 -0
  2. requirements.txt +2 -0
README.md CHANGED
@@ -51,9 +51,19 @@ To enable automatic cloud backup:
 3. In your Space settings, add these secrets:
 - `HF_TOKEN`: Your Hugging Face write token
 - `HF_DATASET_REPO`: Your dataset repository (e.g., `username/reliefweb-annotations`)
+- `HF_RELIEFWEB_PDFS_REPO`: PDF dataset repository (e.g., `ai4data/reliefweb-pdfs`)
 
 Once configured, annotations are automatically pushed to your HF Dataset after each annotation!
 
+## PDF Viewer
+
+The tool includes an embedded PDF viewer powered by `gradio-pdf`. PDFs can be sourced from:
+
+- **Local files**: Use `--pdf-dir /path/to/pdfs` when running locally
+- **Hugging Face Datasets**: Set `HF_RELIEFWEB_PDFS_REPO` environment variable
+
+The PDF viewer automatically navigates to the page containing the dataset mention.
+
 ## Data
 
 This tool processes validation samples from stratified sampling of UNHCR/ReliefWeb documents, comparing:
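The two PDF sources named in the new README section could be resolved roughly as follows. This is a minimal sketch, not the code from this commit: the function name `resolve_pdf`, the local-first precedence, and the assumption that PDFs sit at the top level of the dataset repo under their plain filename are all hypothetical. Only `--pdf-dir`, `HF_RELIEFWEB_PDFS_REPO`, and `HF_TOKEN` come from the README.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_pdf(filename: str, pdf_dir: Optional[str] = None) -> Optional[str]:
    """Return a local path to `filename`, or None if no source is configured.

    Hypothetical helper: checks the local directory first, then the
    Hugging Face dataset repo named by HF_RELIEFWEB_PDFS_REPO.
    """
    # 1. Local files: the directory given via --pdf-dir.
    if pdf_dir:
        candidate = Path(pdf_dir) / filename
        if candidate.is_file():
            return str(candidate)

    # 2. Hugging Face Datasets: download (and cache) the file from the repo.
    repo_id = os.environ.get("HF_RELIEFWEB_PDFS_REPO")
    if repo_id:
        # Imported lazily so the local-only path works without the package.
        from huggingface_hub import hf_hub_download

        return hf_hub_download(
            repo_id=repo_id,
            filename=filename,  # assumed layout: PDF stored under its filename
            repo_type="dataset",
            token=os.environ.get("HF_TOKEN"),
        )

    return None
```

The returned path could then be handed to the `gradio-pdf` component. How the viewer is told which page to open is not shown in this commit, so it is left out of the sketch.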
requirements.txt CHANGED
@@ -1,3 +1,5 @@
 gradio>=4.0.0
+gradio-pdf>=0.0.4
 huggingface_hub>=0.20.0
 datasets>=2.16.0
+python-dotenv>=1.0.0
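The commit does not show how `python-dotenv` is used. One plausible use, sketched here as an assumption, is loading the three settings named in the README from a local `.env` file when running outside a Space. The function name `read_settings` is hypothetical.

```python
import os

try:
    from dotenv import load_dotenv  # provided by python-dotenv
except ImportError:
    # Keep the sketch usable when the package is not installed.
    load_dotenv = None

SETTING_NAMES = ("HF_TOKEN", "HF_DATASET_REPO", "HF_RELIEFWEB_PDFS_REPO")


def read_settings() -> dict:
    """Return the README's settings, with None for any that are unset."""
    if load_dotenv is not None:
        # Looks for a .env file; by default it does not override
        # variables that are already set in the environment.
        load_dotenv()
    return {name: os.environ.get(name) for name in SETTING_NAMES}
```

On a Space the secrets are already present as environment variables, so the `.env` lookup would matter only for local runs.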