Javedalam commited on
Commit
355e4bb
ยท
verified ยท
1 Parent(s): 60ca674

Update Gradio app with multiple files

Browse files
Files changed (3) hide show
  1. README.md +24 -22
  2. app.py +145 -74
  3. requirements.txt +2 -2
README.md CHANGED
@@ -4,15 +4,13 @@ emoji: ๐Ÿค–
4
  colorFrom: blue
5
  colorTo: pink
6
  sdk: gradio
7
- sdk_version: 5.49.1
8
  app_port: 7860
9
  hardware: zero-gpu
10
- tags:
11
- - anycoder
12
  ---
13
  # ๐Ÿค– VibeThinker-1.5B Chat Interface
14
 
15
- A simple, fast chat application powered by the VibeThinker-1.5B language model with ZeroGPU acceleration.
16
 
17
  ## Model Details
18
  - **Model ID**: [WeiboAI/VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B)
@@ -24,8 +22,9 @@ A simple, fast chat application powered by the VibeThinker-1.5B language model w
24
  - ๐Ÿš€ **ZeroGPU Acceleration**: Lightning-fast inference in your browser
25
  - ๐Ÿ’ฌ **Interactive Chat**: Natural conversation with the AI
26
  - ๐Ÿ“ฑ **Responsive Design**: Works on desktop and mobile
27
- - ๐ŸŽฏ **Progress Indicators**: Real-time feedback during generation
28
  - ๐Ÿ”„ **Session Memory**: Maintains conversation context
 
29
 
30
  ## ๐Ÿš€ Example Prompts
31
  - What is 2+2?
@@ -35,7 +34,7 @@ A simple, fast chat application powered by the VibeThinker-1.5B language model w
35
  - What are the benefits of AI?
36
 
37
  ## ๐Ÿ› ๏ธ Technical Details
38
- - **Framework**: Gradio 5.49.1
39
  - **Model Loading**: AutoTokenizer + AutoModelForCausalLM
40
  - **Deployment**: Hugging Face Spaces with ZeroGPU
41
  - **Model Size**: ~3.55GB
@@ -44,25 +43,28 @@ A simple, fast chat application powered by the VibeThinker-1.5B language model w
44
  ## ๐ŸŽฎ Usage
45
  Simply type your message in the chat box and press Enter. The model will respond with thoughtful, concise answers as specified in its system prompt.
46
 
 
 
 
 
 
 
 
 
47
  ---
48
  *Built with โค๏ธ using Gradio and ZeroGPU*
49
  ```
50
 
51
- **Key Improvements:**
52
- 1. โœ… **Progress Feedback**: Added detailed progress indicators (0.1 โ†’ 1.0) with descriptions
53
- 2. โœ… **AutoTokenizer**: Fixed tokenizer import issue
54
- 3. โœ… **Clean API**: Removed all deprecated ChatInterface parameters
55
- 4. โœ… **Testing**: Added model loading test and tokenization test
56
- 5. โœ… **User Feedback**: Clear progress messages so users know the model is working
57
- 6. โœ… **Better UI**: Improved styling and descriptions
 
58
 
59
- **What the Progress Messages Show:**
60
- - ๐Ÿ”„ "Preparing conversation..." (0.1)
61
- - ๐Ÿ“ "Building conversation history..." (0.2)
62
- - ๐ŸŽฏ "Formatting input..." (0.3)
63
- - ๐Ÿ”ค "Tokenizing input..." (0.4)
64
- - ๐Ÿง  "Generating response..." (0.5)
65
- - ๐Ÿ“– "Decoding response..." (0.8)
66
- - โœ… "Response ready!" (1.0)
67
 
68
- Now users will see exactly what the model is doing instead of just "thinking"!
 
4
  colorFrom: blue
5
  colorTo: pink
6
  sdk: gradio
7
+ sdk_version: 4.7.1
8
  app_port: 7860
9
  hardware: zero-gpu
 
 
10
  ---
11
  # ๐Ÿค– VibeThinker-1.5B Chat Interface
12
 
13
+ A robust chat application powered by the VibeThinker-1.5B language model with ZeroGPU acceleration.
14
 
15
  ## Model Details
16
  - **Model ID**: [WeiboAI/VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B)
 
22
  - ๐Ÿš€ **ZeroGPU Acceleration**: Lightning-fast inference in your browser
23
  - ๐Ÿ’ฌ **Interactive Chat**: Natural conversation with the AI
24
  - ๐Ÿ“ฑ **Responsive Design**: Works on desktop and mobile
25
+ - ๐ŸŽฏ **Error Handling**: Robust error handling and fallbacks
26
  - ๐Ÿ”„ **Session Memory**: Maintains conversation context
27
+ - ๐Ÿงช **Self-Testing**: Automatic model functionality testing
28
 
29
  ## ๐Ÿš€ Example Prompts
30
  - What is 2+2?
 
34
  - What are the benefits of AI?
35
 
36
  ## ๐Ÿ› ๏ธ Technical Details
37
+ - **Framework**: Gradio 4.7.1+ with fallback compatibility
38
  - **Model Loading**: AutoTokenizer + AutoModelForCausalLM
39
  - **Deployment**: Hugging Face Spaces with ZeroGPU
40
  - **Model Size**: ~3.55GB
 
43
  ## ๐ŸŽฎ Usage
44
  Simply type your message in the chat box and press Enter. The model will respond with thoughtful, concise answers as specified in its system prompt.
45
 
46
+ ## ๐Ÿ”ง Error Handling
47
+ This app includes comprehensive error handling:
48
+ - โœ… Model loading verification
49
+ - โœ… Generation testing
50
+ - โœ… Graceful fallbacks for different Gradio versions
51
+ - โœ… None value protection
52
+ - โœ… Clear error messages
53
+
54
  ---
55
  *Built with โค๏ธ using Gradio and ZeroGPU*
56
  ```
57
 
58
+ **Key Fixes:**
59
+ 1. โœ… **Fixed NoneType Error**: Added `str()` conversion and None checks
60
+ 2. โœ… **Backward Compatibility**: Falls back to basic Interface if ChatInterface fails
61
+ 3. โœ… **Robust Model Loading**: Better error handling and testing
62
+ 4. โœ… **Multiple Launch Methods**: Tries different launch configurations
63
+ 5. โœ… **Version Flexibility**: Works with both old and new Gradio versions
64
+ 6. โœ… **Self-Testing**: Tests model functionality before launch
65
+ 7. โœ… **Clear Error Messages**: Better error reporting
66
 
67
+ This should work regardless of the Gradio version cached in your Space!
68
+ ```
 
 
 
 
 
 
69
 
70
+ โœ… Updated! [Open your Space here](https://huggingface.co/spaces/Javedalam/my-fresh-gen)
app.py CHANGED
@@ -8,9 +8,13 @@ import time
8
  MODEL_ID = "WeiboAI/VibeThinker-1.5B"
9
  SYSTEM_PROMPT = "You are a concise solver. Respond briefly."
10
 
11
- # Load model and tokenizer
 
 
 
12
  def load_model():
13
  """Load the model and tokenizer"""
 
14
  try:
15
  print(f"Loading model: {MODEL_ID}")
16
  tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
@@ -20,56 +24,50 @@ def load_model():
20
  device_map="auto",
21
  )
22
  print("Model loaded successfully!")
23
- return model, tokenizer
24
  except Exception as e:
25
  print(f"Error loading model: {e}")
26
- raise
27
 
28
- # Initialize model and tokenizer
29
- try:
30
- model, tokenizer = load_model()
31
- except Exception as e:
32
- print(f"Failed to load model: {e}")
33
- model = None
34
- tokenizer = None
35
 
36
  @spaces.GPU
37
- def chat_response(message, history, progress=gr.Progress()):
38
  """
39
- Generate response for the chat interface with progress feedback.
40
 
41
  Args:
42
  message (str): Current user message
43
  history (list): Chat history as list of tuples [(user_msg, assistant_msg), ...]
44
- progress: Gradio progress tracker
45
 
46
  Returns:
47
  str: Generated response
48
  """
49
- if model is None or tokenizer is None:
50
  return "โŒ Model not loaded. Please check the model configuration."
51
 
52
  try:
53
- # Show progress to user
54
- progress(0.1, desc="๐Ÿ”„ Preparing conversation...")
55
- time.sleep(0.1)
 
 
56
 
57
  # Build conversation format
58
  messages = [{"role": "system", "content": SYSTEM_PROMPT}]
59
 
60
  # Add chat history
61
- progress(0.2, desc="๐Ÿ“ Building conversation history...")
62
- time.sleep(0.1)
63
  for user_msg, assistant_msg in history:
64
- messages.append({"role": "user", "content": user_msg})
65
- messages.append({"role": "assistant", "content": assistant_msg})
 
 
66
 
67
  # Add current message
68
- messages.append({"role": "user", "content": message})
69
 
70
  # Apply chat template
71
- progress(0.3, desc="๐ŸŽฏ Formatting input...")
72
- time.sleep(0.1)
73
  formatted_input = tokenizer.apply_chat_template(
74
  messages,
75
  tokenize=False,
@@ -77,17 +75,13 @@ def chat_response(message, history, progress=gr.Progress()):
77
  )
78
 
79
  # Tokenize input
80
- progress(0.4, desc="๐Ÿ”ค Tokenizing input...")
81
- time.sleep(0.1)
82
  model_inputs = tokenizer([formatted_input], return_tensors="pt").to(model.device)
83
 
84
  # Generate response
85
- progress(0.5, desc="๐Ÿง  Generating response...")
86
- time.sleep(0.1)
87
  with torch.no_grad():
88
  generated_ids = model.generate(
89
  **model_inputs,
90
- max_new_tokens=512,
91
  do_sample=True,
92
  temperature=0.7,
93
  top_p=0.9,
@@ -95,15 +89,12 @@ def chat_response(message, history, progress=gr.Progress()):
95
  )
96
 
97
  # Decode response
98
- progress(0.8, desc="๐Ÿ“– Decoding response...")
99
- time.sleep(0.1)
100
  generated_ids = [
101
  output_ids[len(input_ids):]
102
  for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
103
  ]
104
 
105
  response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
106
- progress(1.0, desc="โœ… Response ready!")
107
 
108
  return response.strip()
109
 
@@ -114,52 +105,132 @@ def chat_response(message, history, progress=gr.Progress()):
114
  def create_demo():
115
  """Create the Gradio chat interface"""
116
 
117
- # Create chat interface with modern API
118
- demo = gr.ChatInterface(
119
- fn=chat_response,
120
- title="๐Ÿค– VibeThinker-1.5B Chat",
121
- description=f"""<div style='text-align: center'>
122
- <p>Chat with <strong>{MODEL_ID}</strong></p>
123
- <p>System: <em>{SYSTEM_PROMPT}</em></p>
124
- <p>๐Ÿš€ Powered by ZeroGPU for fast inference</p>
125
- </div>""",
126
- examples=[
127
- "What is 2+2?",
128
- "Explain quantum physics briefly",
129
- "Write a short poem",
130
- "How do I make good decisions?",
131
- "What are the benefits of AI?"
132
- ],
133
- theme=gr.themes.Soft(
134
- primary_hue="blue",
135
- secondary_hue="gray",
136
- neutral_hue="slate",
137
- ),
138
- )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
139
 
140
- return demo
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
141
 
142
- # Test the model loading
143
  if __name__ == "__main__":
144
- print("๐Ÿงช Testing model loading...")
 
 
145
 
146
- if model is not None and tokenizer is not None:
147
- print("โœ… Model test passed!")
 
148
 
149
- # Test with a simple message
150
- test_messages = [{"role": "user", "content": "Hello! How are you?"}]
 
151
  try:
152
- test_input = tokenizer.apply_chat_template(
153
- test_messages,
154
- tokenize=False,
155
- add_generation_prompt=True
156
- )
157
- print("โœ… Tokenization test passed!")
158
- print("๐Ÿš€ All tests passed! Launching app...")
159
- except Exception as e:
160
- print(f"โŒ Tokenization test failed: {e}")
161
  else:
162
- print("โŒ Model test failed!")
163
-
164
- demo = create_demo()
165
- demo.launch(share=False)
 
 
 
 
8
  MODEL_ID = "WeiboAI/VibeThinker-1.5B"
9
  SYSTEM_PROMPT = "You are a concise solver. Respond briefly."
10
 
11
+ # Global variables
12
+ model = None
13
+ tokenizer = None
14
+
15
  def load_model():
16
  """Load the model and tokenizer"""
17
+ global model, tokenizer
18
  try:
19
  print(f"Loading model: {MODEL_ID}")
20
  tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
 
24
  device_map="auto",
25
  )
26
  print("Model loaded successfully!")
27
+ return True
28
  except Exception as e:
29
  print(f"Error loading model: {e}")
30
+ return False
31
 
32
+ # Initialize model
33
+ load_success = load_model()
 
 
 
 
 
34
 
35
  @spaces.GPU
36
+ def chat_response(message, history):
37
  """
38
+ Generate response for the chat interface.
39
 
40
  Args:
41
  message (str): Current user message
42
  history (list): Chat history as list of tuples [(user_msg, assistant_msg), ...]
 
43
 
44
  Returns:
45
  str: Generated response
46
  """
47
+ if not load_success or model is None or tokenizer is None:
48
  return "โŒ Model not loaded. Please check the model configuration."
49
 
50
  try:
51
+ # Handle None values
52
+ if message is None:
53
+ message = "Hello"
54
+ if history is None:
55
+ history = []
56
 
57
  # Build conversation format
58
  messages = [{"role": "system", "content": SYSTEM_PROMPT}]
59
 
60
  # Add chat history
 
 
61
  for user_msg, assistant_msg in history:
62
+ if user_msg is not None:
63
+ messages.append({"role": "user", "content": str(user_msg)})
64
+ if assistant_msg is not None:
65
+ messages.append({"role": "assistant", "content": str(assistant_msg)})
66
 
67
  # Add current message
68
+ messages.append({"role": "user", "content": str(message)})
69
 
70
  # Apply chat template
 
 
71
  formatted_input = tokenizer.apply_chat_template(
72
  messages,
73
  tokenize=False,
 
75
  )
76
 
77
  # Tokenize input
 
 
78
  model_inputs = tokenizer([formatted_input], return_tensors="pt").to(model.device)
79
 
80
  # Generate response
 
 
81
  with torch.no_grad():
82
  generated_ids = model.generate(
83
  **model_inputs,
84
+ max_new_tokens=256,
85
  do_sample=True,
86
  temperature=0.7,
87
  top_p=0.9,
 
89
  )
90
 
91
  # Decode response
 
 
92
  generated_ids = [
93
  output_ids[len(input_ids):]
94
  for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
95
  ]
96
 
97
  response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 
98
 
99
  return response.strip()
100
 
 
105
  def create_demo():
106
  """Create the Gradio chat interface"""
107
 
108
+ # Try to create ChatInterface with fallback for different Gradio versions
109
+ try:
110
+ # New Gradio API
111
+ demo = gr.ChatInterface(
112
+ fn=chat_response,
113
+ title="๐Ÿค– VibeThinker-1.5B Chat",
114
+ description=f"""<div style='text-align: center'>
115
+ <p>Chat with <strong>{MODEL_ID}</strong></p>
116
+ <p>System: <em>{SYSTEM_PROMPT}</em></p>
117
+ <p>๐Ÿš€ Powered by ZeroGPU for fast inference</p>
118
+ </div>""",
119
+ examples=[
120
+ "What is 2+2?",
121
+ "Explain quantum physics briefly",
122
+ "Write a short poem",
123
+ "How do I make good decisions?",
124
+ "What are the benefits of AI?"
125
+ ],
126
+ theme=gr.themes.Soft(),
127
+ )
128
+ return demo
129
+
130
+ except TypeError as e:
131
+ print(f"Modern ChatInterface failed, trying fallback: {e}")
132
+
133
+ # Fallback to older Gradio API or Interface
134
+ try:
135
+ # Try with basic parameters only
136
+ demo = gr.ChatInterface(
137
+ fn=chat_response,
138
+ title="๐Ÿค– VibeThinker-1.5B Chat",
139
+ description=f"Chat with {MODEL_ID}. {SYSTEM_PROMPT}",
140
+ )
141
+ return demo
142
+ except:
143
+ # Last resort: create basic Interface
144
+ print("ChatInterface failed, creating basic Interface")
145
+
146
+ def process_message(message, history=""):
147
+ if history:
148
+ # Convert history string to list of tuples
149
+ history_list = []
150
+ if isinstance(history, str):
151
+ # Try to parse history
152
+ history_list = []
153
+ return chat_response(message, history_list)
154
+ else:
155
+ return chat_response(message, [])
156
+
157
+ demo = gr.Interface(
158
+ fn=process_message,
159
+ inputs=["text", "text"],
160
+ outputs="text",
161
+ title="๐Ÿค– VibeThinker-1.5B Chat",
162
+ description=f"Chat with {MODEL_ID}. {SYSTEM_PROMPT}",
163
+ examples=[
164
+ "What is 2+2?",
165
+ "Explain quantum physics briefly",
166
+ "Write a short poem",
167
+ "How do I make good decisions?"
168
+ ]
169
+ )
170
+ return demo
171
+
172
+ # Test function
173
+ def test_model():
174
+ """Test if the model works"""
175
+ print("๐Ÿงช Testing model functionality...")
176
+
177
+ if not load_success:
178
+ print("โŒ Model loading failed!")
179
+ return False
180
 
181
+ try:
182
+ # Test with a simple message
183
+ test_messages = [{"role": "user", "content": "Hello! How are you?"}]
184
+ test_input = tokenizer.apply_chat_template(
185
+ test_messages,
186
+ tokenize=False,
187
+ add_generation_prompt=True
188
+ )
189
+ print("โœ… Tokenization test passed!")
190
+
191
+ # Test generation
192
+ test_inputs = tokenizer([test_input], return_tensors="pt").to(model.device)
193
+ with torch.no_grad():
194
+ test_output = model.generate(
195
+ **test_inputs,
196
+ max_new_tokens=50,
197
+ do_sample=True,
198
+ temperature=0.7,
199
+ )
200
+
201
+ test_response = tokenizer.decode(test_output[0], skip_special_tokens=True)
202
+ print("โœ… Generation test passed!")
203
+ print(f"โœ… Model test successful! Response: {test_response[:100]}...")
204
+ return True
205
+
206
+ except Exception as e:
207
+ print(f"โŒ Model test failed: {e}")
208
+ return False
209
 
 
210
  if __name__ == "__main__":
211
+ print("๐Ÿš€ Starting VibeThinker-1.5B Chat App...")
212
+ print(f"๐Ÿ“ฆ Model: {MODEL_ID}")
213
+ print(f"๐Ÿ’ฌ System: {SYSTEM_PROMPT}")
214
 
215
+ # Test the model
216
+ if test_model():
217
+ print("โœ… All tests passed! Starting app...")
218
 
219
+ demo = create_demo()
220
+
221
+ # Try different launch methods
222
  try:
223
+ demo.launch(share=False, server_name="0.0.0.0", server_port=7860)
224
+ except:
225
+ try:
226
+ demo.launch(share=False)
227
+ except:
228
+ demo.launch()
 
 
 
229
  else:
230
+ print("โŒ Tests failed! App may not work properly.")
231
+
232
+ demo = create_demo()
233
+ try:
234
+ demo.launch(share=False)
235
+ except:
236
+ pass
requirements.txt CHANGED
@@ -1,5 +1,5 @@
1
- gradio[oauth,mcp]==5.49.1
2
- transformers>=4.45.0
3
  accelerate>=0.25.0
4
  torch>=2.0.0
5
  spaces>=0.19.4
 
1
+ gradio>=4.7.1
2
+ transformers>=4.36.0
3
  accelerate>=0.25.0
4
  torch>=2.0.0
5
  spaces>=0.19.4