Sync from GitHub via hub-sync
README.md
CHANGED
|
@@ -11,21 +11,17 @@ pinned: false
|
|
| 11 |
|
| 12 |
**Run a data or ML task over a Hugging Face dataset in one command, for humans and agents.**
|
| 13 |
|
| 14 |
-
Each recipe is a single self-contained [UV script](https://docs.astral.sh/uv/guides/scripts/): dependencies are declared inline, so you run it straight from a URL
|
| 15 |
|
| 16 |
-
##
|
| 17 |
|
| 18 |
-
|
| 19 |
|
| 20 |
```bash
|
| 21 |
uv run https://huggingface.co/datasets/uv-scripts/jobs-utils/raw/main/list-recipes.py
|
| 22 |
```
|
| 23 |
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
## Run one for real
|
| 27 |
-
|
| 28 |
-
The flagship is **[OCR](https://huggingface.co/datasets/uv-scripts/ocr)**: turn an image dataset into text & structured data, 30+ models. On a managed GPU (no hardware of your own; pay-per-second):
|
| 29 |
|
| 30 |
```bash
|
| 31 |
hf jobs uv run --flavor l4x1 --secrets HF_TOKEN \
|
|
@@ -33,11 +29,12 @@ hf jobs uv run --flavor l4x1 --secrets HF_TOKEN \
|
|
| 33 |
davanstrien/ufo-ColPali your-username/ufo-ocr --max-samples 10
|
| 34 |
```
|
| 35 |
|
| 36 |
-
One command produces a new dataset with a `markdown` column.
|
| 37 |
|
| 38 |
-
|
| 39 |
|
| 40 |
-
Recipes
|
| 41 |
|
| 42 |
**Try it now** (runs a real OCR job and hands back a dataset):
|
| 43 |
|
|
@@ -60,8 +57,10 @@ Pick the one that fits, read its script header for the arguments, and run it wit
|
|
| 60 |
Each recipe reads a Hub dataset and writes a new one, so chain them as needed.
|
| 61 |
```
|
| 62 |
|
| 63 |
-
|
| 64 |
|
| 65 |
-
##
|
| 66 |
|
| 67 |
-
Every
|
|
|
|
| 11 |
|
| 12 |
**Run a data or ML task over a Hugging Face dataset in one command, for humans and agents.**
|
| 13 |
|
| 14 |
+
Each recipe is a single self-contained [UV script](https://docs.astral.sh/uv/guides/scripts/): dependencies are declared inline, so you run it straight from a URL, with no clone, no virtualenv, and no `pip install`. Run it locally with `uv run`, or hand it to [Hugging Face Jobs](https://huggingface.co/docs/hub/jobs) for a managed GPU. Most recipes read a Hub dataset and write a new one, so they chain into pipelines.
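To make "dependencies are declared inline" concrete, here is a minimal sketch of what such a script might look like. The comment block at the top uses the inline script metadata format that `uv run` reads. Everything else is an assumption made for illustration: the `datasets` dependency, the `text` column, the `n_chars` step, and the function names are hypothetical and not taken from any actual recipe.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["datasets"]
# ///
"""Hypothetical recipe sketch: read a Hub dataset, add a column, push a new one."""
import argparse


def parse_args(argv=None):
    # Positional `input output` order, mirroring the convention used in the commands here.
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("input_dataset", help="Hub dataset ID to read")
    parser.add_argument("output_dataset", help="Hub dataset ID to write")
    parser.add_argument("--max-samples", type=int, default=None,
                        help="Optional cap on rows, for quick test runs")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported here so argument parsing works even without the dependency installed.
    from datasets import load_dataset

    ds = load_dataset(args.input_dataset, split="train")
    if args.max_samples is not None:
        ds = ds.select(range(min(args.max_samples, len(ds))))
    # Placeholder step (assumes a `text` column); a real recipe would run a model here.
    ds = ds.map(lambda row: {"n_chars": len(row["text"])})
    ds.push_to_hub(args.output_dataset)


# Entry point omitted in this sketch; a standalone script would call main()
# under an `if __name__ == "__main__":` guard.
```

Because the dependency list travels inside the file, the same script can be passed to `uv run` locally or submitted as a job without a separate requirements file.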
|
| 15 |
|
| 16 |
+
## Quickstart
|
| 17 |
|
| 18 |
+
**See every recipe** (locally, no GPU or token):
|
| 19 |
|
| 20 |
```bash
|
| 21 |
uv run https://huggingface.co/datasets/uv-scripts/jobs-utils/raw/main/list-recipes.py
|
| 22 |
```
|
| 23 |
|
| 24 |
+
**Run one on a GPU** (the flagship: OCR an image dataset to text):
|
| 25 |
|
| 26 |
```bash
|
| 27 |
hf jobs uv run --flavor l4x1 --secrets HF_TOKEN \
|
|
|
|
| 29 |
davanstrien/ufo-ColPali your-username/ufo-ocr --max-samples 10
|
| 30 |
```
|
| 31 |
|
| 32 |
+
One command produces a new dataset with a `markdown` column. Pay-per-second, no hardware of your own.
|
| 33 |
|
| 34 |
+
<details>
|
| 35 |
+
<summary><b>Drive it with your coding agent</b></summary>
|
| 36 |
|
| 37 |
+
Recipes take their arguments in the same `input output` order and run from a URL, so an agent can pick one and run it with no setup. Paste into Claude Code, Cursor, or similar:
|
| 38 |
|
| 39 |
**Try it now** (runs a real OCR job and hands back a dataset):
|
| 40 |
|
|
|
|
| 57 |
Each recipe reads a Hub dataset and writes a new one, so chain them as needed.
|
| 58 |
```
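Since each recipe reads one dataset and writes another, chaining amounts to feeding one step's output ID to the next step as its input. The sketch below only builds the command lines and does not submit anything. It reuses the `hf jobs uv run --flavor ... --secrets HF_TOKEN` shape shown above; the script URLs, the `step-N` naming scheme, and the helper names are hypothetical.

```python
import shlex


def job_command(script_url, input_dataset, output_dataset, flavor="l4x1"):
    """Build the argv for one `hf jobs uv run` invocation (nothing is executed)."""
    return ["hf", "jobs", "uv", "run", "--flavor", flavor, "--secrets", "HF_TOKEN",
            script_url, input_dataset, output_dataset]


def chain(script_urls, source, namespace):
    """Return one command per step; each step's output becomes the next step's input."""
    commands = []
    current = source
    for i, url in enumerate(script_urls, start=1):
        output = f"{namespace}/step-{i}"  # hypothetical naming scheme
        commands.append(job_command(url, current, output))
        current = output
    return commands


# Hypothetical script URLs, used only to show the shape of a two-step pipeline.
steps = ["https://example.com/recipe-a.py", "https://example.com/recipe-b.py"]
for cmd in chain(steps, "davanstrien/ufo-ColPali", "your-username"):
    print(shlex.join(cmd))
```

Printing the commands before running them gives a person or an agent a chance to review dataset names and GPU flavor before any paid job starts.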
|
| 59 |
|
| 60 |
+
The cookbook also ships a ready-made **agent skill** for discovering and running recipes; see the [GitHub repo](https://github.com/davanstrien/uv-scripts-for-ai), and Hugging Face's own [`hf` CLI skill for agents](https://huggingface.co/docs/hub/agents-cli). _(We'll refine these prompts over time.)_
|
| 61 |
+
|
| 62 |
+
</details>
|
| 63 |
|
| 64 |
+
## Browse
|
| 65 |
|
| 66 |
+
Every recipe is in the list below: OCR, detection & segmentation, audio transcription, NER & classification, embeddings & atlas maps, batch LLM/VLM inference, synthetic data, and dataset creation. Or browse on **[GitHub](https://github.com/davanstrien/uv-scripts-for-ai)** · run `hf jobs hardware` for GPU flavors & pricing.
|