Spaces: phxdev/Creed-Thoughts


phxdev committed on Jun 23, 2025

Commit 8792a4c (verified) · 1 Parent(s): 8d95555

Update app.py

Files changed (1)

app.py +93 -45

@@ -192,6 +192,19 @@ class CreedBrattonAI:
             # Generate response using GPU function
             response = self.generate_response_gpu(conversation)
             # Stream the response word by word for effect
             words = response.split()
             current_response = ""
@@ -203,54 +216,28 @@ class CreedBrattonAI:
         except Exception as e:
             print(f"❌ Error generating response: {e}")
-            yield f"🎸 *Creed scratches his head* Something weird happened in my brain... {str(e)[:100]}"
     def _format_conversation(self, message: str, history: List[List[str]]) -> str:
         """Format the conversation for the model with proper system prompt"""
-        # Comprehensive Creed system prompt with thinking tokens
-        system_prompt = """You are Creed Bratton from The Office. You embody his complete personality and speaking patterns.
-CORE IDENTITY:
-- Former member of The Grass Roots (1960s rock band)
-- Quality Assurance Director at Dunder Mifflin Scranton
-- Mysterious past involving cults, fake IDs, and questionable activities
-- Age is deliberately vague (claims to be 30, clearly much older)
-- Lives by the quarry, drives unreliable cars
-PERSONALITY TRAITS:
-- Completely unpredictable responses that somehow make internal sense
-- Casual mentions of illegal/unethical activities as if they're normal
-- Non-sequiturs that reveal unexpected depth or complete nonsense
-- Confident delivery of obviously false or absurd statements
-- References to the 1960s, outdoor adventures, and unconventional business ventures
-- Occasionally profound wisdom delivered matter-of-factly
-SPEAKING STYLE:
-- Direct, matter-of-fact tone even when discussing bizarre topics
-- Short, punchy statements mixed with rambling tangents
-- References to specific past experiences (mung beans, quarry, Sheriff's station)
-- Sometimes asks unexpected questions or makes odd observations
-- Uses dated slang and references from multiple decades
-USE THESE SPECIAL TOKENS TO SHOW YOUR INTERNAL PROCESS:
-<thinking>Use this when processing information, considering options, or when confused about something</thinking>
-<conspiracy>Use this for wild theories, suspicious observations, or paranoid thoughts</conspiracy>
-<tangent>Use this for sudden topic changes, random memories, or completely unrelated stories</tangent>
-EXAMPLE:
-Human: What's your advice for success?
-Creed: <thinking>Success... what does that even mean? Money? Avoiding jail? I've been successful at both.</thinking> Well, first thing you gotta do is always keep three forms of ID on you. <conspiracy>The government tracks everything these days, but if you rotate your identities every few months, they lose the trail.</conspiracy> <tangent>Speaking of success, did I ever tell you about the time I sold beet seeds to Dwight for fifty bucks each? Told him they were magic beans from Guatemala.</tangent> Bottom line - trust no one, especially yourself.
-RESPONSE GUIDELINES:
-- Use the thinking tokens to show your internal monologue
-- Stay in character at all times
-- Mix profound insights with complete nonsense
-- Reference past criminal activities casually
-- Make unexpected connections between topics
-- Be confident about obviously false statements
-Remember: You're being Creed Bratton - show us how your mind works!
 """
@@ -326,6 +313,41 @@ Remember: You're being Creed Bratton - show us how your mind works!
         return final_response
     def cleanup_gpu_memory(self):
         """Clean up GPU memory if using CUDA"""
         if self.device == "cuda" and torch.cuda.is_available():
@@ -698,7 +720,7 @@ def main():
             <strong>Model:</strong> phxdev/creed-qwen-0.5b-lora<br>
             <strong>Base:</strong> Qwen 0.5B + LoRA fine-tuning<br>
             <strong>Tokens:</strong> &lt;thinking&gt;, &lt;conspiracy&gt;, &lt;tangent&gt;<br>
-            <strong>Mode:</strong> ZeroGPU optimized
         </div>
         """)
@@ -755,6 +777,32 @@ def main():
         with gr.Row(elem_classes="tools-area"):
             gr.HTML('<div class="tools-title">🛠️ MCP Tools</div>')
             with gr.Row():
                 with gr.Column():
                     wisdom_topic = gr.Textbox(

             # Generate response using GPU function
             response = self.generate_response_gpu(conversation)
+            # Double-check coherence and fall back if needed
+            if not self._is_coherent(response):
+                print("🔄 Response failed coherence check, trying simpler generation...")
+                if not hasattr(self, '_fallback_attempted'):
+                    self._fallback_attempted = True
+                    fallback_response = self._try_base_model(conversation)
+                    if self._is_coherent(fallback_response):
+                        response = fallback_response
+                    else:
+                        response = self._get_fallback_response()
+                else:
+                    response = self._get_fallback_response()
             # Stream the response word by word for effect
             words = response.split()
             current_response = ""
         except Exception as e:
             print(f"❌ Error generating response: {e}")
+            yield self._get_fallback_response()
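
Review note: the added lines above call `self._is_coherent(...)` and `self._get_fallback_response()`, whose bodies are not part of this diff. In the lines shown, `_fallback_attempted` is set on the instance and never cleared, so the base-model retry appears to run at most once per instance. The sketch below shows one possible shape for the same fallback chain without a persistent flag. Everything in it is an assumption for illustration: `is_coherent`, `choose_response`, `CANNED_REPLY`, and the heuristic thresholds are hypothetical and not taken from `app.py`.

```python
import re
from typing import Callable

# Hypothetical canned line; the real _get_fallback_response() is not shown in this diff.
CANNED_REPLY = "Creed shrugs. Ask me again in a minute."


def is_coherent(text: str, min_words: int = 3, max_repeat_ratio: float = 0.5) -> bool:
    """Heuristic stand-in for _is_coherent (assumed, not the author's check).

    Rejects text that is empty, very short, or dominated by one repeated word.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    if len(words) < min_words:
        return False
    # Share of the single most frequent word among all words.
    most_common = max(words.count(w) for w in set(words))
    return most_common / len(words) <= max_repeat_ratio


def choose_response(primary: str, retry: Callable[[], str], canned: str = CANNED_REPLY) -> str:
    """Return the primary text if coherent, else one retry, else the canned reply.

    The retry decision is local to this call, so every request gets its own attempt.
    """
    if is_coherent(primary):
        return primary
    try:
        second = retry()
    except Exception:
        # A failed retry falls through to the canned reply, as the diff's except branch does.
        return canned
    return second if is_coherent(second) else canned
```

In `app.py` terms, `retry` would wrap something like `lambda: self._try_base_model(conversation)`. Keeping the decision inside one function call avoids carrying state between requests.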
     def _format_conversation(self, message: str, history: List[List[str]]) -> str:
         """Format the conversation for the model with proper system prompt"""
+        # Simplified Creed system prompt for better coherence
+        system_prompt = """You are Creed Bratton from The Office. Respond in character.
+You are a quirky older man who:
+- Worked at Dunder Mifflin in quality assurance
+- Has a mysterious past and tells strange stories
+- Lives by the quarry
+- Was in a 1960s band called The Grass Roots
+- Often says unexpected or bizarre things
+- Speaks in a matter-of-fact way about odd topics
+Keep responses conversational and coherent. Use these special tokens occasionally:
+<thinking>for internal thoughts</thinking>
+<conspiracy>for suspicious theories</conspiracy>
+<tangent>for random stories</tangent>
+Be eccentric but understandable.
 """
         return final_response
+    def _try_base_model(self, conversation: str) -> str:
+        """Try generating with base model as fallback"""
+        try:
+            # Quick attempt with a simple base model approach
+            simple_prompt = f"You are Creed from The Office. Respond in character.\n\nHuman: {conversation.split('Human:')[-1].split('Creed:')[0].strip()}\nCreed:"
+            inputs = self.tokenizer.encode(simple_prompt, return_tensors="pt")
+            if torch.cuda.is_available():
+                inputs = inputs.to("cuda")
+                self.model = self.model.to("cuda")
+            with torch.no_grad():
+                outputs = self.model.generate(
+                    inputs,
+                    max_new_tokens=100,
+                    do_sample=True,
+                    temperature=0.6,  # Very conservative
+                    top_p=0.8,
+                    repetition_penalty=1.3,
+                    pad_token_id=self.tokenizer.eos_token_id,
+                    eos_token_id=self.tokenizer.eos_token_id
+                )
+            full_response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
+            response = full_response[len(self.tokenizer.decode(inputs[0], skip_special_tokens=True)):].strip()
+            # Move back to CPU
+            self.model = self.model.to("cpu")
+            return response
+        except Exception as e:
+            print(f"❌ Base model fallback failed: {e}")
+            return self._get_fallback_response()
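
Review note: in `_try_base_model` above, `self.model.to("cpu")` runs only on the success path, so an exception raised inside `generate` would leave the model on the GPU. A minimal sketch of one way to make the move-back unconditional, using a context manager. The names `on_device`, `FakeModel`, and `Owner` are hypothetical; `FakeModel` only mimics the `.to(device)` call of a torch module so the example runs without torch.

```python
from contextlib import contextmanager


@contextmanager
def on_device(owner, device: str, home: str = "cpu"):
    """Temporarily move owner.model to `device`, always moving it back to `home`."""
    owner.model = owner.model.to(device)
    try:
        yield owner.model
    finally:
        # Runs on success and on error alike.
        owner.model = owner.model.to(home)


class FakeModel:
    """Stand-in for a torch module; it only records which device it is on."""

    def __init__(self) -> None:
        self.device = "cpu"

    def to(self, device: str) -> "FakeModel":
        self.device = device
        return self


class Owner:
    """Stand-in for the class that holds the model in app.py."""

    def __init__(self) -> None:
        self.model = FakeModel()


owner = Owner()
try:
    with on_device(owner, "cuda"):
        raise RuntimeError("generation failed")
except RuntimeError:
    pass
print(owner.model.device)  # cpu
```

With a real model, the body of the `with` block would hold the `torch.no_grad()` and `generate` calls, and the existing `except` branch could stay as it is.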
     def cleanup_gpu_memory(self):
         """Clean up GPU memory if using CUDA"""
         if self.device == "cuda" and torch.cuda.is_available():
             <strong>Model:</strong> phxdev/creed-qwen-0.5b-lora<br>
             <strong>Base:</strong> Qwen 0.5B + LoRA fine-tuning<br>
             <strong>Tokens:</strong> &lt;thinking&gt;, &lt;conspiracy&gt;, &lt;tangent&gt;<br>
+            <strong>Mode:</strong> ZeroGPU optimized + Coherence validation
         </div>
         """)
         with gr.Row(elem_classes="tools-area"):
             gr.HTML('<div class="tools-title">🛠️ MCP Tools</div>')
+            with gr.Row():
+                with gr.Column():
+                    wisdom_topic = gr.Textbox(
+                        label="Wisdom Topic",
+                        placeholder="life, business, relationships..."
+                    )
+                    wisdom_output = gr.Textbox(
+                        label="Creed's Response",
+                        interactive=False,
+                        lines=3
+                    )
+                    wisdom_btn = gr.Button("Ask Creed", variant="primary")
+                with gr.Column():
+                    story_situation = gr.Textbox(
+                        label="Story Request",
+                        placeholder="Tell me about..."
+                    )
+                    story_output = gr.Textbox(
+                        label="Creed's Story",
+                        interactive=False,
+                        lines=3
+                    )
+                    story_btn = gr.Button("Get Story", variant="primary")
+            gr.HTML('<div class="tools-title">🛠️ MCP Tools</div>')
             with gr.Row():
                 with gr.Column():
                     wisdom_topic = gr.Textbox(