jmisak committed · verified
Commit 1f1921e · 1 parent: d4abd8e

Upload 4 files

Files changed (4):
  1. CHANGELOG.md +11 -9
  2. FREE_MODELS.md +34 -25
  3. README.md +14 -13
  4. llm_backend.py +2 -2
CHANGELOG.md CHANGED
@@ -5,11 +5,12 @@ All notable changes to ConversAI will be documented in this file.
 ## [1.1.0] - 2025-11-XX
 
 ### Changed
-- **✨ NEW DEFAULT MODEL**: Switched to Microsoft Phi-3-mini-4k-instruct
-  - Faster, more reliable on HuggingFace free tier
-  - Better quality than previous default (Mixtral-8x7B)
-  - Smaller model = less latency on free tier
+- **✨ NEW DEFAULT MODEL**: Switched to Mistral-7B-Instruct-v0.2
+  - **Verified working** on HuggingFace Inference API
+  - Excellent quality for professional survey work
+  - Actively deployed and maintained
   - **100% free and ungated** - no approvals needed
+  - Previous model (Phi-3) not deployed on Inference API
 
 - **🆓 FOCUS ON FREE MODELS**: Completely revised to use only free, ungated models
 - Removed paid API recommendations (OpenAI, Anthropic)
@@ -24,11 +25,11 @@ All notable changes to ConversAI will be documented in this file.
 - Performance benchmarks
 - Troubleshooting tips
 
-- Alternative free model options:
+- Alternative free model options (verified deployed):
   - google/flan-t5-xxl (very fast)
-  - mistralai/Mistral-7B-Instruct-v0.2 (best quality)
   - google/flan-t5-xl (maximum speed)
-  - google/flan-ul2 (long contexts)
+  - meta-llama/Llama-2-7b-chat-hf (alternative)
+- **Note**: Only use models verified as "Deployed" on HF Inference API
 
 ### Fixed
 - Optimized for HuggingFace free tier reliability
@@ -37,8 +38,9 @@ All notable changes to ConversAI will be documented in this file.
 
 ### Technical Details
 - Default model changed in `llm_backend.py` line 69
-  - From: `mistralai/Mixtral-8x7B-Instruct-v0.1`
-  - To: `microsoft/Phi-3-mini-4k-instruct`
+  - From: `mistralai/Mixtral-8x7B-Instruct-v0.1` (not deployed)
+  - To: `mistralai/Mistral-7B-Instruct-v0.2` (verified deployed)
+  - Reason: Phi-3 initially chosen but not available on Inference API
 
 ---
 
FREE_MODELS.md CHANGED
@@ -4,11 +4,15 @@
 
 ---
 
+> **⚠️ IMPORTANT:** Only models marked as "✅ Deployed" are actively available on HuggingFace Inference API. Others may return 404 errors. **Default (Mistral-7B) is verified working.**
+
+---
+
 ## ✨ TL;DR
 
-**Default model (Phi-3) works great!** Just deploy and use. No configuration needed.
+**Default model (Mistral-7B) works great!** Just deploy and use. No configuration needed.
 
-Want to try others? Set `LLM_MODEL` environment variable to any model below.
+Want to try others? Set the `LLM_MODEL` environment variable to any verified model below.
 
 ---
 
@@ -19,34 +23,38 @@ All models below are:
 - ✅ **Ungated** - No approval needed
 - ✅ **Works on HuggingFace Spaces** - Ready to use
 
-### 1. Microsoft Phi-3-mini-4k-instruct ⭐ (DEFAULT)
+### 1. Mistral-7B-Instruct-v0.2 ⭐ (DEFAULT)
 
-**Best for:** General use, balanced performance
+**Best for:** General use, best quality on free tier
 
 ```bash
-LLM_MODEL=microsoft/Phi-3-mini-4k-instruct
+LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
 ```
 
 **Specs:**
-- Speed: ⚡⚡ Fast (10-30 seconds)
-- Quality: ⭐⭐⭐ Good
-- Size: 3.8B parameters (small, efficient)
-- Context: 4K tokens
+- Speed: ⚡⚡ Medium (20-45 seconds)
+- Quality: ⭐⭐⭐⭐ Excellent
+- Size: 7B parameters
+- Context: 8K tokens
+- Status: ✅ **Actively deployed on HF Inference API**
 
 **Pros:**
-- Fast and reliable
-- Good at following instructions
-- Low latency on free tier
-- Balanced quality/speed
+- **Best quality among free ungated models**
+- Excellent instruction following
+- Good reasoning capabilities
+- Handles complex tasks well
+- Actively maintained and deployed
 
 **Cons:**
-- May struggle with very complex analysis
-- Limited context window (4K)
+- Slower than smaller models
+- May queue during peak times
+- First request can take 60+ seconds (cold start)
 
 **Best for:**
-- Survey generation (5-15 questions)
-- Quick translations (1-3 languages)
-- Basic analysis (20-50 responses)
+- Professional survey generation
+- High-quality translations
+- Detailed analysis (50+ responses)
+- When quality matters most
 
 ---
 
@@ -182,13 +190,14 @@ LLM_MODEL=google/flan-ul2
 
 ## 📊 Model Comparison
 
-| Model | Speed | Quality | Size | Best Use Case |
-|-------|-------|---------|------|---------------|
-| **Phi-3-mini** ⭐ | ⚡⚡ Fast | ⭐⭐⭐ Good | 3.8B | **Default - balanced** |
-| **Flan-T5-XXL** | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | 11B | **Speed priority** |
-| **Mistral-7B** | ⚡ Slow | ⭐⭐⭐⭐ Excellent | 7B | **Quality priority** |
-| **Flan-T5-XL** | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | 3B | **Maximum speed** |
-| **Flan-UL2** | ⚡⚡ Fast | ⭐⭐⭐ Good | 20B | **Long contexts** |
+| Model | Speed | Quality | Size | Deployed | Best Use Case |
+|-------|-------|---------|------|----------|---------------|
+| **Mistral-7B** ⭐ | ⚡⚡ Medium | ⭐⭐⭐⭐ Excellent | 7B | ✅ Yes | **Default - best quality** |
+| **Flan-T5-XXL** | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | 11B | ✅ Yes | **Speed priority** |
+| **Flan-T5-XL** | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | 3B | ✅ Yes | **Maximum speed** |
+| **Llama-2-7b-chat** | ⚡⚡ Medium | ⭐⭐⭐ Good | 7B | ✅ Yes | **Alternative option** |
+
+**Note:** Only models with "✅ Yes" in the Deployed column are currently available on HF Inference API.
 
 ---
 
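The 404 behavior called out above can be probed before switching models. Below is a rough sketch, assuming the classic `api-inference.huggingface.co/models/<id>` endpoint where a 404 means the model is not served; the `is_deployed` helper and its status-code heuristics are illustrative and not part of this repo:

```python
# Hypothetical pre-flight check: does the HF Inference API serve this model?
# Assumption: the classic api-inference endpoint returns 404 for models that
# are not deployed (as noted above); other errors (503 loading, 401 auth)
# indicate the model exists but is busy or needs a token.
import urllib.error
import urllib.request
from typing import Optional

API_BASE = "https://api-inference.huggingface.co/models/"

def inference_url(model_id: str) -> str:
    """Build the Inference API URL for a model id like 'org/name'."""
    return API_BASE + model_id

def is_deployed(model_id: str, token: Optional[str] = None,
                timeout: float = 10.0) -> bool:
    req = urllib.request.Request(inference_url(model_id))
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(req, timeout=timeout):
            return True
    except urllib.error.HTTPError as e:
        # 404 = not served; 503/401/etc. still mean the model endpoint exists.
        return e.code != 404
    except urllib.error.URLError:
        return False  # network failure: treat as unavailable
```

A quick `is_deployed("mistralai/Mistral-7B-Instruct-v0.2")` before deploying a Space would catch the Phi-3 situation described in the changelog.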
README.md CHANGED
@@ -16,7 +16,7 @@ Battle the blank page, reach global audiences, and uncover insights with AI assi
 
 ---
 
-> **✨ NEW (Nov 2025):** Now uses **Microsoft Phi-3** - Faster, reliable, and **completely FREE** on HuggingFace!
+> **✨ UPDATED (Nov 2025):** Now uses **Mistral-7B-Instruct** - High quality, reliable, and **completely FREE** on HuggingFace!
 
 ---
 
@@ -57,12 +57,12 @@ Battle the blank page, reach global audiences, and uncover insights with AI assi
 
 **✨ Zero configuration needed!** ConversAI works out-of-the-box on HuggingFace Spaces.
 
-**Default Model:** Microsoft Phi-3-mini-4k-instruct
+**Default Model:** Mistral-7B-Instruct-v0.2
 - ✅ **100% Free** - No API keys, no costs, ever
-- ✅ **Fast** - Optimized for speed (10-30 seconds)
+- ✅ **High Quality** - Excellent output for professional work (20-45 seconds)
 - ✅ **Ungated** - No approval needed, works immediately
-- ✅ **Good Quality** - Suitable for professional survey work
-- ✅ **Reliable** - Stable on HuggingFace Inference API
+- ✅ **Proven** - Popular model, stable on HuggingFace Inference API
+- ✅ **Reliable** - Actively deployed and maintained
 
 **Setup for PUBLIC Spaces (Recommended):**
 - Just deploy - uses built-in `HF_TOKEN` automatically
@@ -80,15 +80,16 @@ Battle the blank page, reach global audiences, and uncover insights with AI assi
 
 You can try different free models by setting the `LLM_MODEL` environment variable:
 
-**Recommended Free Models:**
+**Recommended Free Models (Verified on HF Inference API):**
 
-| Model | Best For | Speed | Quality | Ungated |
-|-------|----------|-------|---------|---------|
-| **microsoft/Phi-3-mini-4k-instruct** (default) | General use, balanced | ⚡⚡ Fast | ⭐⭐⭐ Good | ✅ Yes |
-| **google/flan-t5-xxl** | Fast responses, instructions | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | ✅ Yes |
-| **mistralai/Mistral-7B-Instruct-v0.2** | Best quality (slower) | ⚡ Slower | ⭐⭐⭐⭐ Excellent | ✅ Yes |
-| **google/flan-t5-xl** | Maximum speed | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | ✅ Yes |
-| **google/flan-ul2** | Long contexts | ⚡⚡ Fast | ⭐⭐⭐ Good | ✅ Yes |
+| Model | Best For | Speed | Quality | Status |
+|-------|----------|-------|---------|--------|
+| **mistralai/Mistral-7B-Instruct-v0.2** (default) | Best quality, general use | ⚡⚡ Medium | ⭐⭐⭐⭐ Excellent | ✅ Deployed |
+| **google/flan-t5-xxl** | Fast responses | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | ✅ Deployed |
+| **google/flan-t5-xl** | Maximum speed | ⚡⚡⚡ Very Fast | ⭐⭐ Decent | ✅ Deployed |
+| **meta-llama/Llama-2-7b-chat-hf** | Alternative quality | ⚡⚡ Medium | ⭐⭐⭐ Good | ✅ Deployed |
+
+**Note:** Only use models marked as "Deployed" - others may not be available on the free Inference API.
 
 **To change model:**
 ```bash
llm_backend.py CHANGED
@@ -65,8 +65,8 @@ class LLMBackend:
         defaults = {
             LLMProvider.OPENAI: "gpt-4o-mini",
             LLMProvider.ANTHROPIC: "claude-3-5-sonnet-20241022",
-            # Using Phi-3 - smaller, faster, free, ungated
-            LLMProvider.HUGGINGFACE: "microsoft/Phi-3-mini-4k-instruct",
+            # Using Mistral-7B - proven to work on HF Inference API, free, ungated
+            LLMProvider.HUGGINGFACE: "mistralai/Mistral-7B-Instruct-v0.2",
             LLMProvider.LM_STUDIO: "google/gemma-3-27b"
         }
         return os.getenv("LLM_MODEL", defaults[self.provider])
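The two-line change above hinges on a simple environment-variable fallback: `LLM_MODEL` wins when set, otherwise the per-provider default applies. A minimal standalone sketch of that selection logic (the enum and defaults table are abbreviated from the diff):

```python
import os
from enum import Enum

class LLMProvider(Enum):
    HUGGINGFACE = "huggingface"

# Per-provider defaults, mirroring the diff; LLM_MODEL overrides them all.
DEFAULTS = {
    LLMProvider.HUGGINGFACE: "mistralai/Mistral-7B-Instruct-v0.2",
}

def default_model(provider: LLMProvider) -> str:
    # Env var takes precedence; otherwise fall back to the provider default.
    return os.getenv("LLM_MODEL", DEFAULTS[provider])

os.environ.pop("LLM_MODEL", None)
print(default_model(LLMProvider.HUGGINGFACE))
# -> mistralai/Mistral-7B-Instruct-v0.2

os.environ["LLM_MODEL"] = "google/flan-t5-xxl"
print(default_model(LLMProvider.HUGGINGFACE))
# -> google/flan-t5-xxl
```

This is why "To change model" in the README is just a matter of setting one Space variable: no code change is needed to swap in any of the verified alternatives.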