Expand org README: community + open-foundations framing
#2
by davanstrien HF Staff - opened
README.md
CHANGED
@@ -6,24 +6,28 @@ colorTo: indigo
 sdk: static
 pinned: false
 ---
-
-<br>
-<br>
+
 <div align="center">
-<
-<br>
+<h1>Small Models for GLAM</h1>
 
 
 
-<
-
-
-
-
-
-
-
-
-
-
-
+</div>
+
+Most of what gets done in libraries, archives and museums runs on a long tail of small, repetitive jobs: backlogs to clear, scans to make searchable, metadata to tidy. A good chunk of that work can be handled by small, task-specific models, and the people who know what those tasks are are the people working in those institutions.
+
+This org is a place to put the models that come out of that work, so the next institution facing the same problem doesn't start from scratch.
+
+Each model here builds on something. Most are fine-tunes of open foundation models (YOLO, DETR, BERT, Qwen-VL) trained on community datasets, often from [BigLAM](https://huggingface.co/biglam) or contributed by individual institutions. Several extend existing community-trained models for new collections rather than starting over: [index-card-detector-v5](https://huggingface.co/small-models-for-glam/index-card-detector-v5) takes the National Library of Scotland's archival card detector and extends it to three additional archives. That extension pattern matters: it's how this kind of work gets cheaper for everyone over time.
+
+Recipes for most of the models live in [AI Patterns for GLAM](https://danielvanstrien.xyz/ai-patterns-for-glam/); [The Case for Boring AI](https://danielvanstrien.xyz/ai-patterns-for-glam/discovery/boring-ai.html) and [Beyond Chatbots](https://danielvanstrien.xyz/ai-patterns-for-glam/discovery/beyond-chatbots.html) set out the why.
+
+## How the models get built
+
+Mostly with agentic workflows: an agent handles the data prep, training, and packaging; a human stays in the loop for the parts that matter: label review, evaluation, deciding whether something is good enough to release.
+
+## Share a model, or suggest one
+
+If you've trained a small task-specific model for your own collection, share it in [Discussions](https://huggingface.co/spaces/small-models-for-glam/README/discussions) and we'll add good ones to a curated collection so other institutions can find them. Suggestions for tasks you'd like to see covered are welcome there too.
+
+Maintained by [Daniel van Strien](https://huggingface.co/davanstrien) and [William Mattingly](https://huggingface.co/wjbmattingly), with contributions and datasets from across the GLAM ML community.