Spaces: AdamTT/Model_Fit_Finder

AdamTT committed 28 days ago

Commit 9d9b8ca · verified · 1 Parent(s): de3dea2

Update README.md

Files changed (1)

README.md +157 -1

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 title: Model Fit Finder
-emoji: 📉
+emoji: 👀
 colorFrom: red
 colorTo: red
 sdk: gradio
@@ -8,6 +8,162 @@ sdk_version: 6.4.0
 app_file: app.py
 pinned: false
 license: apache-2.0
+short_description: Space that helps you choose the right type of NLP model
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Model Fit Finder (CPU)
+**Model Fit Finder** is a decision-support Space that helps you choose the **right type of NLP model** and **concrete Hugging Face models** for your task, with no training, no GPU, and no guesswork.
+The Space is designed to reflect real-world AI engineering decisions rather than showcase a single model demo.
+---
+## What this Space does
+The Space guides the user through a small set of practical questions and then:
+* identifies the **appropriate model category** (instruction, QA, embeddings),
+* ranks and recommends **at least three concrete Hugging Face models**,
+* explains **why these models were selected**,
+* adapts recommendations based on **language, compute budget, and priority**,
+* optionally pulls **up-to-date models directly from Hugging Face Hub**.
+All recommendations are **CPU-friendly** and suitable for lightweight prototyping and production planning.
+---
+## Supported NLP tasks
+The Space currently supports three common NLP problem types:
+### 1. Chat / instruction-following / generation
+For tasks such as:
+* chatbots
+* summarization
+* explanation
+* instruction-based text processing
+Recommended models are **instruction-tuned text-to-text or generative models**.
+---
+### 2. Question Answering from documents (extractive QA)
+For tasks where:
+* you have a document or text,
+* answers must come strictly from that text,
+* hallucinations should be minimized.
+Recommended models are **extractive QA models** fine-tuned on datasets like SQuAD.
+---
+### 3. Semantic similarity / search / deduplication
+For tasks such as:
+* finding semantically similar texts,
+* detecting near-duplicates,
+* semantic search,
+* retrieval for RAG pipelines.
+Recommended models are **embedding (sentence similarity) models**.
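The three categories above correspond to task tags that exist on the Hugging Face Hub. The following is a minimal sketch of such a mapping, not the Space's actual code: the category keys (`chat`, `extractive_qa`, `similarity`), the helper name `pipeline_tags_for`, and the choice of tags per category are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed mapping from the three task categories to Hub pipeline tags.
# The README does not state which tags the Space actually uses.
TASK_TO_PIPELINE_TAGS: Dict[str, List[str]] = {
    "chat": ["text2text-generation", "text-generation"],
    "extractive_qa": ["question-answering"],
    "similarity": ["sentence-similarity"],
}


def pipeline_tags_for(task: str) -> List[str]:
    """Return the pipeline tags for a task category, or raise for unknown tasks."""
    try:
        return TASK_TO_PIPELINE_TAGS[task]
    except KeyError:
        raise ValueError(f"Unsupported task: {task!r}") from None
```

Keeping the mapping in one dictionary means a new task type could be supported by adding a single entry.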
+---
+## How recommendations are generated
+Recommendations are **not static**. The Space uses simple, explicit decision logic based on:
+* **Data language**
+  * EN
+  * PL
+  * Mixed / multilingual
+* **Compute budget**
+  * Low (fast, small models)
+  * Medium (allows larger, higher-quality models)
+* **Priority**
+  * Speed
+  * Quality
+* **Model source**
+  * Curated (hand-picked, stable baseline)
+  * HF Live (fresh models from Hugging Face Hub)
+  * Hybrid (curated + live)
+Each candidate model is scored using heuristics such as:
+* model size (small vs base),
+* language coverage (English vs multilingual),
+* suitability for the selected budget and priority,
+* stability (curated vs live).
+The Space always returns **a minimum of three models**.
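The heuristics listed above could be sketched as a small scoring function. This is an illustration under stated assumptions, not the Space's implementation: the `Candidate` fields, the numeric weights, and the model ids (placeholders under `example/`, not real Hub repositories) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str           # placeholder id, not a real Hub repository
    size: str           # "small" or "base"
    multilingual: bool  # True if the model covers more than English
    source: str         # "curated" or "live"


def score(c: Candidate, language: str, budget: str, priority: str) -> int:
    """Heuristic fit score; higher is better. All weights are illustrative."""
    s = 0
    # Language coverage: non-English data strongly favors multilingual models.
    if language == "EN":
        s += 1 if c.multilingual else 2
    else:  # "PL" or "Mixed"
        s += 3 if c.multilingual else -3
    # Budget and priority: small models fit low budgets and speed,
    # base models fit a medium budget and quality.
    if c.size == "small":
        s += 2 if budget == "low" else 0
        s += 1 if priority == "speed" else 0
    else:
        s += -2 if budget == "low" else 1
        s += 1 if priority == "quality" else 0
    # Stability: curated entries get a small bonus over live ones.
    if c.source == "curated":
        s += 1
    return s


def recommend(cands: List[Candidate], language: str, budget: str,
              priority: str, k: int = 3) -> List[Candidate]:
    """Return the top-k candidates, never fewer than three when available."""
    ranked = sorted(cands, key=lambda c: score(c, language, budget, priority),
                    reverse=True)
    return ranked[:max(k, 3)]


CANDIDATES = [
    Candidate("example/small-en", "small", False, "curated"),
    Candidate("example/small-multi", "small", True, "curated"),
    Candidate("example/base-en", "base", False, "curated"),
    Candidate("example/base-multi", "base", True, "live"),
]
```

With these weights, `recommend(CANDIDATES, "PL", "low", "speed")` ranks the small multilingual placeholder first, while `recommend(CANDIDATES, "EN", "medium", "quality")` ranks the base English placeholder first.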
+---
+## Hugging Face Live mode
+When **HF Live** or **Hybrid** mode is enabled, the Space:
+* queries the Hugging Face Hub using task-specific pipeline tags,
+* ranks models by popularity (downloads),
+* applies language and budget heuristics,
+* caches results locally (with TTL),
+* allows manual refresh via a **“Refresh HF cache”** button.
+This prevents the Space from becoming outdated while keeping results stable and interpretable.
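The caching behavior described above (results kept for a limited time, with a manual refresh) might look roughly like the sketch below. It is an assumption-laden illustration: the class name `TTLCache`, the injected `fetch` callable, and the one-hour default are hypothetical. In a real Space, `fetch` could wrap a Hub listing call such as `huggingface_hub.HfApi.list_models`; here it is left abstract so the example runs offline.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Cache fetch results per pipeline tag and expire them after a TTL."""

    def __init__(self, fetch: Callable[[str], List[str]],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical function that queries the Hub
        self._ttl = ttl_seconds  # assumed default of one hour
        self._clock = clock      # injectable so tests can control time
        self._store: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, pipeline_tag: str) -> List[str]:
        """Return cached results if still fresh, otherwise fetch and store."""
        now = self._clock()
        entry = self._store.get(pipeline_tag)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = self._fetch(pipeline_tag)
        self._store[pipeline_tag] = (now, result)
        return result

    def refresh(self) -> None:
        """Drop all entries, as a manual refresh button might do."""
        self._store.clear()
```

Injecting the clock and the fetch function keeps the cache testable without network access or real waiting.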
+---
+## What this Space is (and is not)
+**This Space is:**
+* a model selection assistant,
+* a practical decision tool,
+* CPU-only and cost-free,
+* suitable for engineers, analysts, and ML practitioners.
+**This Space is not:**
+* a chatbot demo,
+* a benchmark leaderboard,
+* an automatic “best model” oracle.
+Its goal is to help you make **better-informed model choices**, not to hide trade-offs.
+---
+## Example use cases
+* *“Which embedding model should I use to detect semantically similar Revit Key Notes?”*
+* *“I have a policy document and want reliable question answering without hallucinations.”*
+* *“I need a lightweight instruction-following model for short summaries on CPU.”*
+* *“Which models make sense for Polish or mixed-language text?”*
+---
+## Technical notes
+* No model training is performed.
+* No GPU is required.
+* All logic runs on CPU.
+* Model recommendations are based on metadata, heuristics, and Hugging Face Hub signals.
+---
+## Why this Space exists
+Choosing the right model is often harder than using one.
+This Space focuses on **model selection reasoning** — the part that usually lives only in engineers’ heads — and makes it explicit, inspectable, and reusable.