Commit fb7143c
Parent(s): 64b4b2f
add gradio pdf and readme
Files changed:
- README.md +10 -0
- requirements.txt +2 -0
README.md
CHANGED
@@ -51,9 +51,19 @@ To enable automatic cloud backup:
 3. In your Space settings, add these secrets:
 - `HF_TOKEN`: Your Hugging Face write token
 - `HF_DATASET_REPO`: Your dataset repository (e.g., `username/reliefweb-annotations`)
+- `HF_RELIEFWEB_PDFS_REPO`: PDF dataset repository (e.g., `ai4data/reliefweb-pdfs`)
 
 Once configured, annotations are automatically pushed to your HF Dataset after each annotation!
 
+## PDF Viewer
+
+The tool includes an embedded PDF viewer powered by `gradio-pdf`. PDFs can be sourced from:
+
+- **Local files**: Use `--pdf-dir /path/to/pdfs` when running locally
+- **Hugging Face Datasets**: Set `HF_RELIEFWEB_PDFS_REPO` environment variable
+
+The PDF viewer automatically navigates to the page containing the dataset mention.
+
 ## Data
 
 This tool processes validation samples from stratified sampling of UNHCR/ReliefWeb documents, comparing:
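The README change above names two PDF sources: a local directory passed with `--pdf-dir` and a dataset repository named by `HF_RELIEFWEB_PDFS_REPO`. The commit page does not show how the app chooses between them, so the following is only a minimal sketch of one plausible lookup order. The function name `resolve_pdf`, the local-first precedence, and the assumption that PDFs sit at the root of the dataset repository are all hypothetical; `hf_hub_download` is the real `huggingface_hub` call.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_pdf(filename: str, pdf_dir: Optional[str] = None) -> Optional[Path]:
    """Return a local path to `filename`, or None if it cannot be located.

    Hypothetical helper: the lookup order below is an assumption, not the
    app's confirmed behavior.
    """
    # 1. Local files: the directory that would be passed via --pdf-dir.
    if pdf_dir:
        candidate = Path(pdf_dir) / filename
        if candidate.is_file():
            return candidate

    # 2. Hugging Face Datasets: the repo named by HF_RELIEFWEB_PDFS_REPO.
    repo_id = os.environ.get("HF_RELIEFWEB_PDFS_REPO")
    if repo_id:
        # Imported lazily so local-only use does not require the package.
        from huggingface_hub import hf_hub_download

        downloaded = hf_hub_download(
            repo_id=repo_id,
            filename=filename,  # assumes PDFs are stored at the repo root
            repo_type="dataset",
            token=os.environ.get("HF_TOKEN"),
        )
        return Path(downloaded)

    # Neither source is configured, or the local file does not exist.
    return None
```

Checking the local directory first avoids a network round trip during local development, and the Space falls through to the dataset repository when no directory is given.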
requirements.txt
CHANGED
@@ -1,3 +1,5 @@
 gradio>=4.0.0
+gradio-pdf>=0.0.4
 huggingface_hub>=0.20.0
 datasets>=2.16.0
+python-dotenv>=1.0.0
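The new `python-dotenv` dependency suggests that the three settings named in the README (`HF_TOKEN`, `HF_DATASET_REPO`, `HF_RELIEFWEB_PDFS_REPO`) can also come from a local `.env` file, although the commit page does not show the loading code. A minimal sketch, assuming that usage: the `Settings` class, the `read_settings` function, and the rule that backup needs both the token and the dataset repository are hypothetical, while `load_dotenv` is the real `python-dotenv` entry point.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

try:
    # Optional: copies variables from a local .env file into os.environ
    # without overriding variables that are already set.
    from dotenv import load_dotenv

    load_dotenv()
except ImportError:
    pass  # On a Space, secrets already arrive as environment variables.


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three settings named in the README."""

    hf_token: Optional[str]
    dataset_repo: Optional[str]
    pdfs_repo: Optional[str]

    @property
    def backup_enabled(self) -> bool:
        # Assumption: pushing annotations needs both a token and a target repo.
        return bool(self.hf_token and self.dataset_repo)


def read_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (defaults to os.environ); empty values become None."""
    source = os.environ if env is None else env
    return Settings(
        hf_token=source.get("HF_TOKEN") or None,
        dataset_repo=source.get("HF_DATASET_REPO") or None,
        pdfs_repo=source.get("HF_RELIEFWEB_PDFS_REPO") or None,
    )
```

Passing an explicit mapping to `read_settings` keeps the function easy to exercise without touching the real environment.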