gr8monk3ys committed
Commit 11e4b92 · verified · 1 Parent(s): 211b3cb

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +78 -5
  2. app.py +397 -0
  3. requirements.txt +2 -0
README.md CHANGED
@@ -1,12 +1,85 @@
  ---
  title: Model Selector
- emoji: 📉
- colorFrom: indigo
- colorTo: gray
+ emoji: 🎯
+ colorFrom: yellow
+ colorTo: red
  sdk: gradio
- sdk_version: 6.5.1
+ sdk_version: 5.9.1
+ python_version: "3.10"
  app_file: app.py
  pinned: false
+ license: mit
+ short_description: Find the perfect HuggingFace model for your task
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Model Selector
+
+ Find the perfect HuggingFace model for your task. Answer a few simple questions and get personalized recommendations with ready-to-use code examples.
+
+ ## Features
+
+ ### 10 Task Categories
+ - **Text Generation** - Chatbots, content writing, code
+ - **Text Classification** - Sentiment, spam, topics
+ - **Question Answering** - Document QA, FAQs
+ - **Translation** - 200+ languages
+ - **Summarization** - Articles, documents
+ - **Image Classification** - Photos, medical images
+ - **Object Detection** - Detect objects in images
+ - **Image Generation** - Create images from text
+ - **Speech Recognition** - Audio to text
+ - **Embeddings** - Semantic search, RAG
+
+ ### Smart Filtering
+ - Filter by model size (tiny to large)
+ - Prioritize by speed, quality, or popularity
+ - Get recommendations tailored to your use case
+
+ ### Ready-to-Use Code
+ Every recommendation includes:
+ - Working Python code example
+ - Direct link to the model
+ - License information
+ - Size/speed tradeoffs
+
+ ## How to Use
+
+ 1. **Select your task** (e.g., Text Generation)
+ 2. **Choose size preference** based on your hardware
+ 3. **Set priority** (speed, quality, or popularity)
+ 4. **Describe your use case** (optional)
+ 5. **Get recommendations** with code examples!
+
+ ## Example Output
+
+ For "Text Generation" with "Small" size preference:
+
+ | Rank | Model | Size | License |
+ |------|-------|------|---------|
+ | 1 | microsoft/phi-3-mini | 3.8B | MIT |
+ | 2 | Qwen/Qwen2.5-3B-Instruct | 3B | Apache |
+ | 3 | mistralai/Mistral-7B | 7B | Apache |
+
+ ## Quick Reference
+
+ | Task | Typical Size | Best For |
+ |------|--------------|----------|
+ | Text Generation | 3B - 70B | Chatbots, content |
+ | Classification | 50M - 300M | Sentiment, spam |
+ | Embeddings | 20M - 100M | Search, RAG |
+ | Speech | 200M - 1.5B | Transcription |
+
+ ## Why Use This Tool?
+
+ - **Save time** - Don't search through thousands of models
+ - **Avoid mistakes** - Get proven, popular models
+ - **Quick start** - Copy-paste code examples
+ - **Right-sized** - Match models to your hardware
+
+ ## License
+
+ MIT
+
+ ## Author
+
+ Built by [Lorenzo Scaturchio](https://huggingface.co/gr8monk3ys)
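The "How to Use" flow above (pick a task, filter by size, sort by priority) can be sketched in plain Python. The model names mirror the example-output table; the `recommend` helper and `size_m` field are illustrative names for this sketch, not the app's actual API:

```python
# Illustrative sketch of the selection flow: filter candidates by a
# parameter-count ceiling (in millions), then rank by the chosen priority.
CANDIDATES = [
    {"name": "microsoft/phi-3-mini", "size_m": 3800, "license": "MIT"},
    {"name": "Qwen/Qwen2.5-3B-Instruct", "size_m": 3000, "license": "Apache"},
    {"name": "mistralai/Mistral-7B", "size_m": 7000, "license": "Apache"},
]

def recommend(max_size_m: int, priority: str) -> list[str]:
    """Filter by size ceiling, then sort by the selected priority."""
    picks = [m for m in CANDIDATES if m["size_m"] <= max_size_m]
    if priority == "Smallest/Fastest":
        picks.sort(key=lambda m: m["size_m"])
    elif priority == "Best Quality":
        picks.sort(key=lambda m: m["size_m"], reverse=True)
    # "Most Popular" keeps the curated order as-is.
    return [m["name"] for m in picks]

print(recommend(5000, "Smallest/Fastest"))
# → ['Qwen/Qwen2.5-3B-Instruct', 'microsoft/phi-3-mini']
```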
app.py ADDED
@@ -0,0 +1,397 @@
+ """
+ Model Selector - Find the right HuggingFace model for your task.
+
+ Answer a few questions and get personalized model recommendations.
+ """
+
+ import gradio as gr
+ from huggingface_hub import HfApi, list_models
+ from typing import Optional
+
+ # ---------------------------------------------------------------------------
+ # Task Categories and Model Recommendations
+ # ---------------------------------------------------------------------------
+
+ TASKS = {
+     "Text Generation": {
+         "id": "text-generation",
+         "description": "Generate text, stories, code, or continue prompts",
+         "use_cases": ["Chatbots", "Content writing", "Code completion", "Story generation"],
+         "top_models": [
+             {"name": "meta-llama/Llama-3.1-8B-Instruct", "size": "8B", "license": "llama3.1"},
+             {"name": "mistralai/Mistral-7B-Instruct-v0.3", "size": "7B", "license": "apache-2.0"},
+             {"name": "Qwen/Qwen2.5-7B-Instruct", "size": "7B", "license": "apache-2.0"},
+             {"name": "google/gemma-2-9b-it", "size": "9B", "license": "gemma"},
+             {"name": "microsoft/phi-3-mini-4k-instruct", "size": "3.8B", "license": "mit"},
+         ]
+     },
+     "Text Classification": {
+         "id": "text-classification",
+         "description": "Classify text into categories (sentiment, topic, intent)",
+         "use_cases": ["Sentiment analysis", "Spam detection", "Topic classification", "Intent detection"],
+         "top_models": [
+             {"name": "distilbert-base-uncased-finetuned-sst-2-english", "size": "67M", "license": "apache-2.0"},
+             {"name": "cardiffnlp/twitter-roberta-base-sentiment-latest", "size": "125M", "license": "mit"},
+             {"name": "facebook/bart-large-mnli", "size": "400M", "license": "mit"},
+             {"name": "MoritzLaworoutedistilbert-base-uncased-sentiment", "size": "67M", "license": "apache-2.0"},
+         ]
+     },
+     "Question Answering": {
+         "id": "question-answering",
+         "description": "Answer questions based on context or knowledge",
+         "use_cases": ["Customer support", "Document QA", "Knowledge retrieval", "FAQ bots"],
+         "top_models": [
+             {"name": "deepset/roberta-base-squad2", "size": "125M", "license": "cc-by-4.0"},
+             {"name": "distilbert-base-cased-distilled-squad", "size": "67M", "license": "apache-2.0"},
+             {"name": "google/flan-t5-base", "size": "250M", "license": "apache-2.0"},
+             {"name": "Intel/dynamic_tinybert", "size": "15M", "license": "apache-2.0"},
+         ]
+     },
+     "Translation": {
+         "id": "translation",
+         "description": "Translate text between languages",
+         "use_cases": ["Multilingual apps", "Document translation", "Real-time translation"],
+         "top_models": [
+             {"name": "facebook/nllb-200-distilled-600M", "size": "600M", "license": "cc-by-nc-4.0"},
+             {"name": "Helsinki-NLP/opus-mt-en-de", "size": "74M", "license": "apache-2.0"},
+             {"name": "google/madlad400-3b-mt", "size": "3B", "license": "apache-2.0"},
+             {"name": "facebook/mbart-large-50-many-to-many-mmt", "size": "611M", "license": "mit"},
+         ]
+     },
+     "Summarization": {
+         "id": "summarization",
+         "description": "Summarize long documents or articles",
+         "use_cases": ["News summarization", "Document condensing", "Meeting notes", "Research papers"],
+         "top_models": [
+             {"name": "facebook/bart-large-cnn", "size": "400M", "license": "mit"},
+             {"name": "google/pegasus-xsum", "size": "568M", "license": "apache-2.0"},
+             {"name": "philschmid/bart-large-cnn-samsum", "size": "400M", "license": "mit"},
+             {"name": "google/flan-t5-large", "size": "780M", "license": "apache-2.0"},
+         ]
+     },
+     "Image Classification": {
+         "id": "image-classification",
+         "description": "Classify images into categories",
+         "use_cases": ["Product categorization", "Medical imaging", "Quality control", "Content moderation"],
+         "top_models": [
+             {"name": "google/vit-base-patch16-224", "size": "86M", "license": "apache-2.0"},
+             {"name": "microsoft/resnet-50", "size": "25M", "license": "apache-2.0"},
+             {"name": "facebook/convnext-base-224", "size": "88M", "license": "apache-2.0"},
+             {"name": "timm/efficientnet_b0.ra_in1k", "size": "5M", "license": "apache-2.0"},
+         ]
+     },
+     "Object Detection": {
+         "id": "object-detection",
+         "description": "Detect and locate objects in images",
+         "use_cases": ["Autonomous vehicles", "Security cameras", "Inventory management", "Sports analytics"],
+         "top_models": [
+             {"name": "facebook/detr-resnet-50", "size": "41M", "license": "apache-2.0"},
+             {"name": "hustvl/yolos-tiny", "size": "6M", "license": "apache-2.0"},
+             {"name": "microsoft/table-transformer-detection", "size": "42M", "license": "mit"},
+             {"name": "facebook/detr-resnet-101", "size": "60M", "license": "apache-2.0"},
+         ]
+     },
+     "Image Generation": {
+         "id": "text-to-image",
+         "description": "Generate images from text descriptions",
+         "use_cases": ["Art creation", "Product visualization", "Marketing content", "Game assets"],
+         "top_models": [
+             {"name": "stabilityai/stable-diffusion-xl-base-1.0", "size": "6.9B", "license": "openrail++"},
+             {"name": "black-forest-labs/FLUX.1-schnell", "size": "12B", "license": "apache-2.0"},
+             {"name": "runwayml/stable-diffusion-v1-5", "size": "1B", "license": "creativeml-openrail-m"},
+             {"name": "stabilityai/sdxl-turbo", "size": "6.9B", "license": "openrail++"},
+         ]
+     },
+     "Speech Recognition": {
+         "id": "automatic-speech-recognition",
+         "description": "Convert speech to text",
+         "use_cases": ["Transcription", "Voice commands", "Meeting notes", "Accessibility"],
+         "top_models": [
+             {"name": "openai/whisper-large-v3", "size": "1.5B", "license": "apache-2.0"},
+             {"name": "openai/whisper-medium", "size": "769M", "license": "apache-2.0"},
+             {"name": "openai/whisper-small", "size": "244M", "license": "apache-2.0"},
+             {"name": "facebook/wav2vec2-base-960h", "size": "95M", "license": "apache-2.0"},
+         ]
+     },
+     "Embeddings": {
+         "id": "feature-extraction",
+         "description": "Generate embeddings for semantic search and similarity",
+         "use_cases": ["Semantic search", "Recommendation systems", "Clustering", "RAG systems"],
+         "top_models": [
+             {"name": "sentence-transformers/all-MiniLM-L6-v2", "size": "22M", "license": "apache-2.0"},
+             {"name": "sentence-transformers/all-mpnet-base-v2", "size": "109M", "license": "apache-2.0"},
+             {"name": "BAAI/bge-small-en-v1.5", "size": "33M", "license": "mit"},
+             {"name": "intfloat/e5-small-v2", "size": "33M", "license": "mit"},
+         ]
+     },
+ }
+
+ SIZE_PREFERENCES = {
+     "Tiny (< 100M)": {"min": 0, "max": 100},
+     "Small (100M - 500M)": {"min": 100, "max": 500},
+     "Medium (500M - 2B)": {"min": 500, "max": 2000},
+     "Large (2B - 10B)": {"min": 2000, "max": 10000},
+     "Any size": {"min": 0, "max": 100000},
+ }
+
+ # ---------------------------------------------------------------------------
+ # Core Functions
+ # ---------------------------------------------------------------------------
+
+ def get_recommendations(
+     task: str,
+     size_pref: str,
+     priority: str,
+     use_case: str
+ ) -> tuple[str, str]:
+     """Get model recommendations based on user preferences."""
+
+     if task not in TASKS:
+         return "Please select a task.", ""
+
+     task_info = TASKS[task]
+     models = task_info["top_models"]
+
+     # Filter by size if preference is set
+     size_range = SIZE_PREFERENCES.get(size_pref, SIZE_PREFERENCES["Any size"])
+
+     def parse_size(size_str):
+         """Parse size string to millions."""
+         size_str = size_str.upper()
+         if 'B' in size_str:
+             return float(size_str.replace('B', '')) * 1000
+         elif 'M' in size_str:
+             return float(size_str.replace('M', ''))
+         return 0
+
+     if size_pref != "Any size":
+         models = [m for m in models if size_range["min"] <= parse_size(m["size"]) <= size_range["max"]]
+
+     if not models:
+         return "No models match your size preference. Try 'Any size'.", ""
+
+     # Sort by priority
+     if priority == "Smallest/Fastest":
+         models = sorted(models, key=lambda x: parse_size(x["size"]))
+     elif priority == "Most Popular":
+         # Keep original order (already sorted by popularity)
+         pass
+     elif priority == "Best Quality":
+         # Larger models tend to be higher quality
+         models = sorted(models, key=lambda x: parse_size(x["size"]), reverse=True)
+
+     # Build recommendation output
+     recs = []
+     recs.append(f"## Recommendations for: {task}\n")
+     recs.append(f"*{task_info['description']}*\n")
+
+     if use_case:
+         recs.append(f"**Your use case:** {use_case}\n")
+
+     recs.append("---\n")
+
+     for i, model in enumerate(models[:4], 1):
+         recs.append(f"### {i}. {model['name']}")
+         recs.append(f"- **Size:** {model['size']} parameters")
+         recs.append(f"- **License:** {model['license']}")
+         recs.append(f"- **Link:** [View on HuggingFace](https://huggingface.co/{model['name']})")
+         recs.append("")
+
+     # Build code example
+     code = generate_code_example(task, models[0] if models else None)
+
+     return "\n".join(recs), code
+
+
+ def generate_code_example(task: str, model: Optional[dict]) -> str:
+     """Generate code example for using the recommended model."""
+
+     if not model:
+         return ""
+
+     model_name = model["name"]
+
+     code_templates = {
+         "Text Generation": f'''```python
+ from transformers import pipeline
+
+ generator = pipeline("text-generation", model="{model_name}")
+
+ result = generator(
+     "Write a story about a robot:",
+     max_length=100,
+     num_return_sequences=1
+ )
+ print(result[0]["generated_text"])
+ ```''',
+
+         "Text Classification": f'''```python
+ from transformers import pipeline
+
+ classifier = pipeline("text-classification", model="{model_name}")
+
+ result = classifier("I love this product! It's amazing!")
+ print(result)  # [{{'label': 'POSITIVE', 'score': 0.99}}]
+ ```''',
+
+         "Question Answering": f'''```python
+ from transformers import pipeline
+
+ qa = pipeline("question-answering", model="{model_name}")
+
+ result = qa(
+     question="What is the capital of France?",
+     context="France is a country in Europe. Paris is its capital city."
+ )
+ print(result["answer"])  # Paris
+ ```''',
+
+         "Translation": f'''```python
+ from transformers import pipeline
+
+ translator = pipeline("translation", model="{model_name}")
+
+ result = translator("Hello, how are you?")
+ print(result[0]["translation_text"])
+ ```''',
+
+         "Summarization": f'''```python
+ from transformers import pipeline
+
+ summarizer = pipeline("summarization", model="{model_name}")
+
+ long_text = """Your long article text here..."""
+ result = summarizer(long_text, max_length=130, min_length=30)
+ print(result[0]["summary_text"])
+ ```''',
+
+         "Image Classification": f'''```python
+ from transformers import pipeline
+
+ classifier = pipeline("image-classification", model="{model_name}")
+
+ result = classifier("path/to/image.jpg")
+ print(result)  # [{{'label': 'cat', 'score': 0.95}}]
+ ```''',
+
+         "Speech Recognition": f'''```python
+ from transformers import pipeline
+
+ transcriber = pipeline("automatic-speech-recognition", model="{model_name}")
+
+ result = transcriber("audio.mp3")
+ print(result["text"])
+ ```''',
+
+         "Embeddings": f'''```python
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("{model_name}")
+
+ sentences = ["This is a sentence", "This is another sentence"]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)  # (2, 384)
+ ```''',
+     }
+
+     return code_templates.get(task, f'''```python
+ from transformers import pipeline
+
+ pipe = pipeline("{TASKS[task]['id']}", model="{model_name}")
+ result = pipe("Your input here")
+ print(result)
+ ```''')
+
+
+ # ---------------------------------------------------------------------------
+ # Gradio Interface
+ # ---------------------------------------------------------------------------
+
+ with gr.Blocks(title="Model Selector", theme=gr.themes.Soft()) as demo:
+     gr.Markdown("""
+     # Model Selector
+
+     Find the perfect HuggingFace model for your task. Answer a few questions
+     and get personalized recommendations with code examples.
+     """)
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             task_select = gr.Dropdown(
+                 choices=list(TASKS.keys()),
+                 label="What do you want to do?",
+                 value="Text Generation"
+             )
+
+             task_description = gr.Markdown(
+                 value=f"*{TASKS['Text Generation']['description']}*"
+             )
+
+             size_select = gr.Dropdown(
+                 choices=list(SIZE_PREFERENCES.keys()),
+                 label="Model size preference?",
+                 value="Any size",
+                 info="Smaller = faster, larger = higher quality"
+             )
+
+             priority_select = gr.Radio(
+                 choices=["Most Popular", "Smallest/Fastest", "Best Quality"],
+                 label="What matters most?",
+                 value="Most Popular"
+             )
+
+             use_case = gr.Textbox(
+                 label="Describe your use case (optional)",
+                 placeholder="e.g., Customer support chatbot for e-commerce"
+             )
+
+             recommend_btn = gr.Button("Get Recommendations", variant="primary", size="lg")
+
+         with gr.Column(scale=1):
+             recommendations = gr.Markdown(label="Recommendations")
+             code_example = gr.Markdown(label="Code Example")
+
+             # Use cases display
+             use_cases_display = gr.Markdown(
+                 value=f"**Common use cases:** {', '.join(TASKS['Text Generation']['use_cases'])}"
+             )
+
+     # Event handlers
+     def update_task_info(task):
+         desc = f"*{TASKS[task]['description']}*"
+         uses = f"**Common use cases:** {', '.join(TASKS[task]['use_cases'])}"
+         return desc, uses
+
+     task_select.change(
+         fn=update_task_info,
+         inputs=[task_select],
+         outputs=[task_description, use_cases_display]
+     )
+
+     recommend_btn.click(
+         fn=get_recommendations,
+         inputs=[task_select, size_select, priority_select, use_case],
+         outputs=[recommendations, code_example]
+     )
+
+     gr.Markdown("""
+     ---
+
+     ### Quick Reference
+
+     | Task | Best For | Typical Size |
+     |------|----------|--------------|
+     | Text Generation | Chatbots, content | 3B - 70B |
+     | Text Classification | Sentiment, topics | 50M - 300M |
+     | Embeddings | Search, RAG | 20M - 100M |
+     | Speech Recognition | Transcription | 200M - 1.5B |
+     | Image Generation | Art, visualization | 1B - 12B |
+
+     ---
+
+     Built by [Lorenzo Scaturchio](https://huggingface.co/gr8monk3ys)
+     """)
+
+
+ if __name__ == "__main__":
+     demo.launch()
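The size filter in `get_recommendations` hinges on `parse_size`, which normalizes strings like "3.8B" or "67M" to millions of parameters. Extracted here as a standalone snippet for quick sanity checks (the body mirrors the nested helper in `app.py` above, with a float return annotation added):

```python
def parse_size(size_str: str) -> float:
    """Convert a parameter-count string ("7B", "67M") to millions."""
    size_str = size_str.upper()
    if "B" in size_str:
        return float(size_str.replace("B", "")) * 1000  # billions -> millions
    if "M" in size_str:
        return float(size_str.replace("M", ""))
    return 0.0  # unparseable sizes sort first and pass no minimum filter

print(parse_size("7B"))   # 7000.0
print(parse_size("67M"))  # 67.0
```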
requirements.txt ADDED
@@ -0,0 +1,2 @@
+ gradio>=5.9.1
+ huggingface_hub>=0.20.0