AdamTT committed on
Commit 9d9b8ca · verified · 1 Parent(s): de3dea2

Update README.md

  1. README.md +157 -1
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
  title: Model Fit Finder
3
- emoji: 📉
4
  colorFrom: red
5
  colorTo: red
6
  sdk: gradio
@@ -8,6 +8,162 @@ sdk_version: 6.4.0
8
  app_file: app.py
9
  pinned: false
10
  license: apache-2.0
 
11
  ---
12
 
13
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
  title: Model Fit Finder
3
+ emoji: 👀
4
  colorFrom: red
5
  colorTo: red
6
  sdk: gradio
 
8
  app_file: app.py
9
  pinned: false
10
  license: apache-2.0
11
+ short_description: Space that helps you choose the right type of NLP model
12
  ---
13
 
14
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
15
+
16
+ # Model Fit Finder (CPU)
17
+
18
+ **Model Fit Finder** is a decision-support Space that helps you choose the **right type of NLP model** and **concrete Hugging Face models** for your task, with no training, no GPU, and no guesswork.
19
+
20
+ The Space is designed to reflect real-world AI engineering decisions rather than showcase a single model demo.
21
+
22
+ ---
23
+
24
+ ## What this Space does
25
+
26
+ The Space guides the user through a small set of practical questions and then:
27
+
28
+ * identifies the **appropriate model category** (instruction, QA, embeddings),
29
+ * ranks and recommends **at least three concrete Hugging Face models**,
30
+ * explains **why these models were selected**,
31
+ * adapts recommendations based on **language, compute budget, and priority**,
32
+ * optionally pulls **up-to-date models directly from the Hugging Face Hub**.
33
+
34
+ All recommendations are **CPU-friendly** and suitable for lightweight prototyping and production planning.
35
+
36
+ ---
37
+
38
+ ## Supported NLP tasks
39
+
40
+ The Space currently supports three common NLP problem types:
41
+
42
+ ### 1. Chat / instruction-following / generation
43
+
44
+ For tasks such as:
45
+
46
+ * chatbots
47
+ * summarization
48
+ * explanation
49
+ * instruction-based text processing
50
+
51
+ Recommended models are **instruction-tuned text-to-text or generative models**.
52
+
53
+ ---
54
+
55
+ ### 2. Question Answering from documents (extractive QA)
56
+
57
+ For tasks where:
58
+
59
+ * you have a document or text,
60
+ * answers must come strictly from that text,
61
+ * hallucinations should be minimized.
62
+
63
+ Recommended models are **extractive QA models** fine-tuned on datasets like SQuAD.
64
+
65
+ ---
66
+
67
+ ### 3. Semantic similarity / search / deduplication
68
+
69
+ For tasks such as:
70
+
71
+ * finding semantically similar texts,
72
+ * detecting near-duplicates,
73
+ * semantic search,
74
+ * retrieval for RAG pipelines.
75
+
76
+ Recommended models are **embedding (sentence similarity) models**.
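A reader's sketch of the task-to-category routing described in this section, not code from the Space. The task keys, the `ModelCategory` enum, and the pairing of categories with Hub pipeline tags are assumptions made for illustration; the README does not say which tags `app.py` actually uses.

```python
from enum import Enum


class ModelCategory(Enum):
    """The three model categories named in the README."""
    INSTRUCTION = "instruction"
    QA = "qa"
    EMBEDDINGS = "embeddings"


# Hypothetical task keys, one per supported NLP problem type.
TASK_TO_CATEGORY = {
    "chat_generation": ModelCategory.INSTRUCTION,
    "document_qa": ModelCategory.QA,
    "semantic_similarity": ModelCategory.EMBEDDINGS,
}

# Assumed pairing of categories with Hugging Face Hub pipeline tags.
CATEGORY_TO_PIPELINE_TAGS = {
    ModelCategory.INSTRUCTION: ("text-generation", "text2text-generation"),
    ModelCategory.QA: ("question-answering",),
    ModelCategory.EMBEDDINGS: ("sentence-similarity",),
}


def category_for(task: str) -> ModelCategory:
    """Return the model category for a task key, rejecting unknown tasks."""
    try:
        return TASK_TO_CATEGORY[task]
    except KeyError:
        raise ValueError(f"Unsupported task: {task!r}") from None
```

Keeping the routing in plain dictionaries makes the decision inspectable, which matches the README's stated goal of explicit selection logic.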
77
+
78
+ ---
79
+
80
+ ## How recommendations are generated
81
+
82
+ Recommendations are **not static**. The Space uses simple but explicit decision logic based on:
83
+
84
+ * **Data language**
85
+
86
+   * EN
87
+   * PL
88
+   * Mixed / multilingual
89
+ * **Compute budget**
90
+
91
+   * Low (fast, small models)
92
+   * Medium (allows larger, higher-quality models)
93
+ * **Priority**
94
+
95
+   * Speed
96
+   * Quality
97
+ * **Model source**
98
+
99
+   * Curated (hand-picked, stable baseline)
100
+   * HF Live (fresh models from Hugging Face Hub)
101
+   * Hybrid (curated + live)
102
+
103
+ Each candidate model is scored using heuristics such as:
104
+
105
+ * model size (small vs base),
106
+ * language coverage (English vs multilingual),
107
+ * suitability for the selected budget and priority,
108
+ * stability (curated vs live).
109
+
110
+ The Space always returns **a minimum of three models**.
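The heuristics listed above could be combined roughly as follows. This is a minimal sketch, not the Space's implementation: the `Candidate` fields, the numeric weights, and the `example/...` model IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    model_id: str        # placeholder IDs, not real Hub models
    size: str            # "small" or "base"
    multilingual: bool
    curated: bool        # True for hand-picked, False for live results


def score(c: Candidate, language: str, budget: str, priority: str) -> float:
    """Combine the README's four heuristics with illustrative weights."""
    s = 0.0
    # Language coverage: PL and Mixed need a multilingual model.
    if language in ("PL", "Mixed"):
        s += 2.0 if c.multilingual else -2.0
    elif not c.multilingual:
        s += 0.5  # slight edge for English-only models on EN data
    # Model size against budget and priority.
    if budget == "Low" or priority == "Speed":
        s += 1.0 if c.size == "small" else -0.5
    if budget == "Medium" and priority == "Quality" and c.size == "base":
        s += 1.0
    # Stability: curated entries get a small bonus over live ones.
    if c.curated:
        s += 0.5
    return s


def recommend(candidates: List[Candidate], language: str, budget: str,
              priority: str, min_results: int = 3) -> List[Candidate]:
    """Return the top `min_results` candidates, highest score first."""
    if len(candidates) < min_results:
        raise ValueError(f"need at least {min_results} candidates")
    ranked = sorted(candidates,
                    key=lambda c: score(c, language, budget, priority),
                    reverse=True)
    return ranked[:min_results]


EXAMPLE_CANDIDATES = [
    Candidate("example/small-en", "small", False, True),
    Candidate("example/base-en", "base", False, True),
    Candidate("example/small-multi", "small", True, True),
    Candidate("example/base-multi", "base", True, False),
]
```

With these weights, `recommend(EXAMPLE_CANDIDATES, "PL", "Low", "Speed")` puts the small multilingual candidate first, while an EN / Medium / Quality request puts the English base model first.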
111
+
112
+ ---
113
+
114
+ ## Hugging Face Live mode
115
+
116
+ When **HF Live** or **Hybrid** mode is enabled, the Space:
117
+
118
+ * queries the Hugging Face Hub using task-specific pipeline tags,
119
+ * ranks models by popularity (downloads),
120
+ * applies language and budget heuristics,
121
+ * caches results locally (with TTL),
122
+ * allows manual refresh via a **“Refresh HF cache”** button.
123
+
124
+ This prevents the Space from becoming outdated while keeping results stable and interpretable.
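The caching and manual refresh described above might look like the sketch below. It is an assumption about the design, not the Space's code: `TTLCache`, the injected `fetch` callable, and the `clock` parameter are hypothetical names. In a real app, `fetch` could wrap a Hub query such as `huggingface_hub.HfApi.list_models`; it is left abstract here so the example runs offline.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache fetch results per pipeline tag and expire them after a TTL."""

    def __init__(self, fetch: Callable[[str], List[str]],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # pipeline tag -> (time stored, model IDs)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, pipeline_tag: str) -> List[str]:
        """Return cached model IDs, refetching when the entry is missing or stale."""
        now = self._clock()
        entry = self._entries.get(pipeline_tag)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = self._fetch(pipeline_tag)
        self._entries[pipeline_tag] = (now, result)
        return result

    def refresh(self) -> None:
        """Drop all entries, as a "Refresh HF cache" button handler might do."""
        self._entries.clear()
```

Injecting the clock and the fetch function keeps the expiry logic testable without network access or real waiting.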
125
+
126
+ ---
127
+
128
+ ## What this Space is (and is not)
129
+
130
+ **This Space is:**
131
+
132
+ * a model selection assistant,
133
+ * a practical decision tool,
134
+ * CPU-only and cost-free,
135
+ * suitable for engineers, analysts, and ML practitioners.
136
+
137
+ **This Space is not:**
138
+
139
+ * a chatbot demo,
140
+ * a benchmark leaderboard,
141
+ * an automatic “best model” oracle.
142
+
143
+ Its goal is to help you make **better-informed model choices**, not to hide trade-offs.
144
+
145
+ ---
146
+
147
+ ## Example use cases
148
+
149
+ * *“Which embedding model should I use to detect semantically similar Revit Key Notes?”*
150
+ * *“I have a policy document and want reliable question answering without hallucinations.”*
151
+ * *“I need a lightweight instruction-following model for short summaries on CPU.”*
152
+ * *“Which models make sense for Polish or mixed-language text?”*
153
+
154
+ ---
155
+
156
+ ## Technical notes
157
+
158
+ * No model training is performed.
159
+ * No GPU is required.
160
+ * All logic runs on CPU.
161
+ * Model recommendations are based on metadata, heuristics, and Hugging Face Hub signals.
162
+
163
+ ---
164
+
165
+ ## Why this Space exists
166
+
167
+ Choosing the right model is often harder than using one.
168
+
169
+ This Space focuses on **model selection reasoning** — the part that usually lives only in engineers’ heads — and makes it explicit, inspectable, and reusable.