Ina-Shapiro committed on
Commit
028ef27
·
1 Parent(s): a213258

Refactor app.py to enhance paper fetching functionality and improve error handling. Update README.md to reflect new features and usage instructions. Remove dotenv dependency from requirements.txt.

Files changed (3)
  1. README.md +23 -13
  2. app.py +205 -256
  3. requirements.txt +0 -1
README.md CHANGED
@@ -16,14 +16,13 @@ A modern conversational AI chatbot designed specifically for exploring and analy
 
  ## ✨ Latest Features
 
- - 📖 **Full Paper Access**: Access complete paper texts from the Papers directory
- - 🔍 **Intelligent Paper Search**: Automatically finds relevant papers based on user queries
- - 🧠 **Smart Conversation Memory**: Maintains chat history with intelligent truncation
  - 🚀 **Real-time Streaming**: Instant response streaming for better UX
  - 🎛️ **Multiple Model Selection**: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
  - ⚙️ **Advanced Parameters**: Fine-tune temperature, max tokens, and top-p
  - 🎨 **Modern UI**: Responsive design with intuitive controls
- - 🔧 **Customizable System Messages**: Define AI personality and behavior
  - 🛡️ **Robust Error Handling**: Clear error messages for common issues
  - 📱 **Mobile Responsive**: Works great on all devices
 
@@ -45,11 +44,21 @@ pip install -r requirements.txt
  ### 3. Configure Environment
 
  #### For Local Development
- Create a `.env` file in the project root:
 
  ```bash
- # .env
- OPENAI_API_KEY=your_openai_api_key_here
  ```
 
  #### For Hugging Face Spaces Deployment
@@ -78,10 +87,11 @@ The chatbot will be available at `http://localhost:7860`
  ## 🎯 Usage Guide
 
  ### Basic Paper Exploration
- 1. **Ask about specific topics**: "Find papers about transformer architecture"
- 2. **Request full papers**: "Show me the full paper about pigs"
- 3. **Get detailed information**: "What's the conclusion of the pig disease paper?"
- 4. **Ask for quotes**: "Quote the methodology section from the pig paper"
 
  ### Advanced Controls
 
@@ -172,7 +182,7 @@ Papers/
  ### Common Issues
 
  **API Key Errors**
- - Ensure your `.env` file contains a valid OpenAI API key
  - Check that the API key has sufficient credits
  - For Hugging Face Spaces: Verify the secret is named `OPENAI_API_KEY`
 
@@ -194,7 +204,7 @@ Papers/
  - Long conversations are automatically truncated
 
  ### Error Messages
- - **"Invalid API key"**: Check your `.env` file or Hugging Face Spaces secrets
  - **"Quota exceeded"**: Add credits to your OpenAI account
  - **"Rate limit"**: Wait and retry
  - **"Paper not found"**: Check that the paper file exists in the Papers directory
 
 
  ## ✨ Latest Features
 
+ - 📖 **Smart Function Calling**: Intelligent paper retrieval using OpenAI's function calling API
+ - 🔍 **Dynamic Paper Fetching**: Automatically fetches full paper texts when needed
+ - 🧠 **Contextual Conversation Memory**: Maintains chat history with intelligent truncation
  - 🚀 **Real-time Streaming**: Instant response streaming for better UX
  - 🎛️ **Multiple Model Selection**: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
  - ⚙️ **Advanced Parameters**: Fine-tune temperature, max tokens, and top-p
  - 🎨 **Modern UI**: Responsive design with intuitive controls
  - 🛡️ **Robust Error Handling**: Clear error messages for common issues
  - 📱 **Mobile Responsive**: Works great on all devices
 
  ### 3. Configure Environment
 
  #### For Local Development
+ Set your OpenAI API key as an environment variable:
 
+ **Windows (PowerShell):**
+ ```powershell
+ $env:OPENAI_API_KEY="your_openai_api_key_here"
+ ```
+
+ **Windows (Command Prompt):**
+ ```cmd
+ set OPENAI_API_KEY=your_openai_api_key_here
+ ```
+
+ **Linux/macOS:**
  ```bash
+ export OPENAI_API_KEY="your_openai_api_key_here"
  ```
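As a quick sanity check before launching (not part of the repository), you can confirm the key is visible to Python the same way `app.py` reads it:

```python
# Checks whether OPENAI_API_KEY is visible to this Python process.
# app.py reads the key the same way, via os.getenv("OPENAI_API_KEY").
import os

key_set = os.getenv("OPENAI_API_KEY") is not None
print("OPENAI_API_KEY set:", key_set)
```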
 
  #### For Hugging Face Spaces Deployment
 
  ## 🎯 Usage Guide
 
  ### Basic Paper Exploration
+ 1. **Ask about specific topics**: "What papers discuss AI's impact on employment?"
+ 2. **Request full papers**: "Show me the full paper about AI companions"
+ 3. **Get detailed information**: "What's the conclusion of the pig disease detection paper?"
+ 4. **Compare findings**: "Compare findings on AI in education"
+ 5. **Ask for specific details**: "What methodology did they use in the pig disease paper?"
 
  ### Advanced Controls
 
  ### Common Issues
 
  **API Key Errors**
+ - Ensure your `OPENAI_API_KEY` environment variable is set correctly
  - Check that the API key has sufficient credits
  - For Hugging Face Spaces: Verify the secret is named `OPENAI_API_KEY`
 
  - Long conversations are automatically truncated
 
  ### Error Messages
+ - **"Invalid API key"**: Check your environment variable or Hugging Face Spaces secrets
  - **"Quota exceeded"**: Add credits to your OpenAI account
  - **"Rate limit"**: Wait and retry
  - **"Paper not found"**: Check that the paper file exists in the Papers directory
app.py CHANGED
@@ -2,13 +2,9 @@ import gradio as gr
  import os
  import json
  import re
- from typing import Iterator, Dict, Any, List
  from openai import OpenAI
  from openai.types.chat import ChatCompletionChunk
- from dotenv import load_dotenv
-
- # Load environment variables
- load_dotenv()
 
  # Load abstracts content once at startup
  def load_abstracts_content():
@@ -19,108 +15,81 @@ def load_abstracts_content():
  except FileNotFoundError:
  return "Abstracts database not found."
 
- # Load full paper texts
- def load_paper_texts():
- """Load all paper texts from the Papers directory."""
  papers = {}
  papers_dir = "Papers"
 
  if not os.path.exists(papers_dir):
- return papers
 
- for filename in os.listdir(papers_dir):
- if filename.endswith('.txt'):
- filepath = os.path.join(papers_dir, filename)
  try:
  with open(filepath, "r", encoding="utf-8") as f:
- content = f.read()
- # Extract title from filename (remove .txt extension)
- title = filename[:-4]
- papers[title] = content
  except Exception as e:
- print(f"Error loading (unknown): {e}")
 
  return papers
 
- # Load abstracts content globally
- ABSTRACTS_CONTENT = load_abstracts_content()
- PAPER_TEXTS = load_paper_texts()
-
- def search_papers(query: str, papers: Dict[str, str]) -> List[tuple[str, str, str]]:
- """
- Search through paper texts for relevant content.
- Returns list of (title, content, relevance_score) tuples.
- """
- results = []
- query_lower = query.lower()
-
- for title, content in papers.items():
- # Simple keyword matching - can be enhanced with more sophisticated search
- relevance_score = 0
-
- # Check if query terms appear in title
- if any(term in title.lower() for term in query_lower.split()):
- relevance_score += 10
-
- # Check if query terms appear in content
- content_lower = content.lower()
- for term in query_lower.split():
- if term in content_lower:
- relevance_score += content_lower.count(term)
-
- if relevance_score > 0:
- # For full paper requests, include more content
- if any(keyword in query.lower() for keyword in ["full paper", "complete paper", "entire paper", "show me the paper", "read the paper"]):
- # Include more content for full paper requests
- truncated_content = content[:8000] + "..." if len(content) > 8000 else content
- else:
- # Truncate content to first 2000 characters for context
- truncated_content = content[:2000] + "..." if len(content) > 2000 else content
- results.append((title, truncated_content, relevance_score))
-
- # Sort by relevance score
- results.sort(key=lambda x: x[2], reverse=True)
- return results[:3] # Return top 3 most relevant papers
-
- def get_relevant_papers_content(user_query: str) -> str:
- """
- Get relevant paper content based on user query.
- """
- if not PAPER_TEXTS:
- return ""
-
- relevant_papers = search_papers(user_query, PAPER_TEXTS)
-
- if not relevant_papers:
- return ""
-
- content = "\n\n=== FULL PAPER CONTENT ===\n"
- for title, paper_content, score in relevant_papers:
- content += f"\n--- {title} ---\n"
- content += paper_content
- content += "\n" + "="*50 + "\n"
-
- return content
-
- def get_full_paper_content(paper_title: str) -> str:
- """
- Get the full content of a specific paper by title.
- """
- if not PAPER_TEXTS:
- return ""
-
- # Try to find the paper by title (case-insensitive)
- for title, content in PAPER_TEXTS.items():
- if paper_title.lower() in title.lower() or title.lower() in paper_title.lower():
- return f"\n\n=== FULL PAPER: {title} ===\n\n{content}"
-
- return ""
-
  def extract_conclusion_from_paper(content: str) -> str:
- """
- Extract the conclusion section from a paper's content.
- """
- # Look for conclusion sections with more specific patterns
  conclusion_patterns = [
  "conclusion and future works",
  "conclusion and future work",
@@ -133,11 +102,9 @@ def extract_conclusion_from_paper(content: str) -> str:
  lines = content.split('\n')
  conclusion_start = -1
 
- # First, try to find a proper conclusion section
  for i, line in enumerate(lines):
  line_lower = line.lower().strip()
  if any(pattern in line_lower for pattern in conclusion_patterns):
- # Check if it's a section header
  if (line.isupper() or
  line.strip().endswith(':') or
  len(line.strip()) < 100 or
@@ -146,83 +113,29 @@ def extract_conclusion_from_paper(content: str) -> str:
  break
 
  if conclusion_start != -1:
- # Extract from conclusion start to acknowledgments or references
  conclusion_lines = []
  for line in lines[conclusion_start:]:
  line_stripped = line.strip()
- # Stop at acknowledgments or references
  if (line_stripped.lower().startswith('acknowledgments') or
  line_stripped.lower().startswith('references') or
  line_stripped.startswith('--- Page')):
  break
  conclusion_lines.append(line)
 
- conclusion_text = '\n'.join(conclusion_lines)
- return conclusion_text
-
- # If no conclusion section found, look for the final paragraphs
- # Find the last substantial paragraph (usually before references or acknowledgments)
- lines_reversed = list(reversed(lines))
- final_content_start = -1
-
- for i, line in enumerate(lines_reversed):
- line_stripped = line.strip()
- # Skip empty lines and page markers
- if (line_stripped and
- not line_stripped.startswith('--- Page') and
- not line_stripped.startswith('References') and
- not line_stripped.lower().startswith('acknowledgments')):
- # Look for the last substantial paragraph
- if len(line_stripped) > 50: # Substantial line
- final_content_start = len(lines) - i
- break
-
- if final_content_start != -1:
- # Get the last 1500 characters from the final content
- final_content = '\n'.join(lines[final_content_start:])
- return final_content[-1500:] if len(final_content) > 1500 else final_content
 
  # Fallback: return the last 1000 characters
  return content[-1000:] if len(content) > 1000 else content
 
- # Get API key with better error handling
- api_key = os.getenv("OPENAI_API_KEY")
- if not api_key:
- print("⚠️ Warning: OPENAI_API_KEY environment variable not set!")
- print("Please set your OpenAI API key as an environment variable.")
- print("For Hugging Face Spaces: Add OPENAI_API_KEY as a repository secret")
- print("For local development: Create a .env file with OPENAI_API_KEY=your_key")
- # Create a dummy client for UI to load (will show error when used)
- client = None
- else:
- # Initialize OpenAI client with latest configuration
- client = OpenAI(
- api_key=api_key,
- timeout=60.0, # 60 second timeout
- max_retries=3 # Retry failed requests up to 3 times
- )
-
- # Available models
- AVAILABLE_MODELS = {
- "GPT-4o-mini": "gpt-4o-mini",
- "GPT-4o": "gpt-4o",
- "GPT-3.5 Turbo": "gpt-3.5-turbo"
- }
-
  def truncate_conversation_history(messages: list, max_tokens: int = 8000) -> list:
- """
- Truncate conversation history to stay within token limits.
- Keeps the most recent messages and system message.
- """
- if len(messages) <= 3: # System + 1 user + 1 assistant
  return messages
 
- # Always keep system message
  system_message = messages[0]
  conversation_messages = messages[1:]
 
- # Keep only the most recent messages
- while len(conversation_messages) > 6: # Keep last 3 exchanges
  conversation_messages = conversation_messages[2:]
 
  return [system_message] + conversation_messages
@@ -236,65 +149,39 @@ def respond(
  top_p: float,
  ) -> Iterator[str]:
  """
- Generate a response using OpenAI's latest models.
- Maintains conversation history in the messages array with proper truncation.
  """
  if not client:
- yield "❌ Error: OpenAI API key not configured. Please set the OPENAI_API_KEY environment variable or add it as a repository secret in Hugging Face Spaces."
  return
 
  if not message.strip():
  yield "Please enter a message to start the conversation."
  return
 
- # Get relevant full paper content based on user query
- relevant_papers_content = get_relevant_papers_content(message)
-
- # Check if user is asking for a specific paper (e.g., "show me the full paper about pigs")
- specific_paper_content = ""
- conclusion_content = ""
-
- if any(keyword in message.lower() for keyword in ["full paper", "complete paper", "entire paper", "show me the paper", "read the paper"]):
- # Try to find specific paper content
- for title in PAPER_TEXTS.keys():
- if any(term in title.lower() for term in message.lower().split()):
- specific_paper_content = get_full_paper_content(title)
- break
-
- # Check if user is asking for conclusions specifically
- if any(keyword in message.lower() for keyword in ["conclusion", "conclusions", "what's the conclusion", "what is the conclusion"]):
- for title, content in PAPER_TEXTS.items():
- if any(term in title.lower() for term in message.lower().split()):
- conclusion_text = extract_conclusion_from_paper(content)
- conclusion_content = f"\n\n=== CONCLUSION FROM {title} ===\n\n{conclusion_text}"
- break
-
- # Initialize messages array with system message
- # Use the pre-loaded abstracts content and relevant full papers
- system_prompt = f"""You are an AI chatbot designed to help users explore and analyze AI research papers. Your primary function is to retrieve relevant papers and answer questions about them based solely on the provided paper database.
 
- Here is the current research paper database:
 
  {ABSTRACTS_CONTENT}
 
- {relevant_papers_content}
-
- {specific_paper_content}
-
- {conclusion_content}
-
- IMPORTANT INSTRUCTIONS:
- 1. When users ask for specific details, conclusions, or quotes from papers, use the full paper content provided above to give accurate, detailed responses.
- 2. If the full paper content is available, you can quote directly from it and provide comprehensive answers including conclusions, methodology, and specific findings.
- 3. When asked for conclusions, look for sections titled "Conclusion", "Conclusions", or the final paragraphs of the paper.
- 4. When asked for quotes, provide the exact text from the paper content provided.
- 5. You can now access the complete text of papers and provide detailed information including conclusions, methodology, and specific quotes.
- 6. If a user asks for the "full paper" or "complete paper", provide a comprehensive summary including all major sections (abstract, introduction, methodology, results, conclusions).
- 7. When conclusion content is specifically provided, use that content to answer conclusion-related questions."""
 
  messages = [{"role": "system", "content": system_prompt}]
 
- # Add conversation history to messages array
  for user_msg, assistant_msg in history:
  if user_msg and user_msg.strip():
  messages.append({"role": "user", "content": user_msg.strip()})
@@ -304,55 +191,134 @@ IMPORTANT INSTRUCTIONS:
  # Add current user message
  messages.append({"role": "user", "content": message.strip()})
 
- # Truncate conversation if it gets too long
  messages = truncate_conversation_history(messages)
 
  try:
- # Get the actual model identifier
  model = AVAILABLE_MODELS.get(model_name, "gpt-4o-mini")
 
- # Generate response using the selected model
  response = client.chat.completions.create(
  model=model,
  messages=messages,
  max_tokens=max_tokens,
  temperature=temperature,
  top_p=top_p,
- stream=True,
- # Additional parameters for better control
- presence_penalty=0.0,
- frequency_penalty=0.0
  )
 
- # Stream the response with proper error handling
- response_text = ""
  for chunk in response:
- if hasattr(chunk.choices[0], 'delta') and chunk.choices[0].delta.content is not None:
- response_text += chunk.choices[0].delta.content
- yield response_text
 
  except Exception as e:
  error_message = f"Error: {str(e)}"
  if "api_key" in str(e).lower():
- error_message = "Error: Invalid or missing OpenAI API key. Please check your configuration."
  elif "quota" in str(e).lower():
- error_message = "Error: API quota exceeded. Please check your OpenAI account."
  elif "rate" in str(e).lower():
- error_message = "Error: Rate limit exceeded. Please wait a moment and try again."
  yield error_message
 
  def chat_fn(message, history, model_name, max_tokens, temperature, top_p):
- """
- Single function that handles the entire chat interaction.
- This is the proper way to handle chatbot interfaces in Gradio.
- """
  if not message.strip():
  return history
 
- # Add user message to history
  history.append([message, ""])
 
- # Generate response
  for response in respond(message, history[:-1], model_name, max_tokens, temperature, top_p):
  history[-1][1] = response
  yield history
@@ -361,7 +327,7 @@ def clear_history() -> tuple:
  """Clear the conversation history."""
  return [], ""
 
- # Create the Gradio interface with latest features
  with gr.Blocks(
  title="📚 AI Research Paper Chatbot",
  theme=gr.themes.Soft(),
@@ -376,21 +342,18 @@ with gr.Blocks(
  """
  # 📚 AI Research Paper Chatbot
 
- Chat with an AI assistant designed to help you explore and analyze AI research papers. The chatbot maintains conversation history to provide contextual responses.
 
  **Features:**
- - 📖 Research paper analysis and retrieval
- - 💬 Conversation memory with smart truncation
- - 🚀 Real-time streaming responses
  - 🎛️ Multiple model selection
- - ⚙️ Customizable parameters
- - 🎨 Modern, responsive UI
  """
  )
 
  with gr.Row():
  with gr.Column(scale=3):
- # Chat interface
  chatbot = gr.Chatbot(
  height=500,
  show_label=False,
@@ -409,7 +372,6 @@ with gr.Blocks(
  clear_btn = gr.Button("Clear", variant="secondary", scale=1)
 
  with gr.Column(scale=1):
- # Control panel
  gr.Markdown("### ⚙️ Settings")
 
  model_dropdown = gr.Dropdown(
@@ -434,7 +396,7 @@ with gr.Blocks(
  value=0.7,
  step=0.1,
  label="Temperature",
- info="Creativity level (0.0 = focused, 2.0 = creative)"
  )
 
  top_p_slider = gr.Slider(
@@ -446,22 +408,20 @@ with gr.Blocks(
  info="Response diversity"
  )
 
- # Example messages
  gr.Markdown("### 💡 Examples")
- example_btn1 = gr.Button("Find papers about transformer architecture", size="sm")
- example_btn2 = gr.Button("What are the latest developments in reinforcement learning?", size="sm")
- example_btn3 = gr.Button("Summarize research on large language models", size="sm")
- example_btn4 = gr.Button("Show me the full paper about pigs", size="sm")
- example_btn5 = gr.Button("What's the conclusion of the pig disease paper?", size="sm")
 
- # Simple event handling with proper chat function
  msg.submit(
  chat_fn,
  [msg, chatbot, model_dropdown, max_tokens_slider, temperature_slider, top_p_slider],
  [chatbot],
  show_progress=True
  ).then(
- lambda: "", # Clear input
  outputs=[msg]
  )
 
@@ -471,37 +431,26 @@ with gr.Blocks(
  [chatbot],
  show_progress=True
  ).then(
- lambda: "", # Clear input
  outputs=[msg]
  )
 
  clear_btn.click(clear_history, outputs=[chatbot, msg])
 
- # Example button handlers
- example_btn1.click(lambda: "Find papers about transformer architecture", outputs=msg)
- example_btn2.click(lambda: "What are the latest developments in reinforcement learning?", outputs=msg)
- example_btn3.click(lambda: "Summarize research on large language models", outputs=msg)
- example_btn4.click(lambda: "Show me the full paper about pigs", outputs=msg)
- example_btn5.click(lambda: "What's the conclusion of the pig disease paper?", outputs=msg)
 
  if __name__ == "__main__":
- # Check if API key is set
  if not os.getenv("OPENAI_API_KEY"):
  print("⚠️ Warning: OPENAI_API_KEY environment variable not set!")
- print("Please set your OpenAI API key as an environment variable.")
- print("For Hugging Face Spaces: Add OPENAI_API_KEY as a repository secret")
- print("For local development: Create a .env file with OPENAI_API_KEY=your_key")
- print("\nTo get an API key:")
- print("1. Visit https://platform.openai.com/api-keys")
- print("2. Sign in or create an account")
- print("3. Generate a new API key")
- print("4. Add it to your .env file or Hugging Face Spaces secrets")
 
- # Launch with proper configuration for Hugging Face Spaces
  demo.launch(
  server_name="0.0.0.0",
  server_port=7860,
- share=False, # Disable sharing to avoid issues
  show_error=True,
  quiet=False
  )
 
  import os
  import json
  import re
+ from typing import Iterator, Dict, Any, List, Optional
  from openai import OpenAI
  from openai.types.chat import ChatCompletionChunk
 
  # Load abstracts content once at startup
  def load_abstracts_content():
  except FileNotFoundError:
  return "Abstracts database not found."
 
+ # Load abstracts content globally
+ ABSTRACTS_CONTENT = load_abstracts_content()
+
+ # Get API key with better error handling
+ api_key = os.getenv("OPENAI_API_KEY")
+ if not api_key:
+ print("⚠️ Warning: OPENAI_API_KEY environment variable not set!")
+ client = None
+ else:
+ client = OpenAI(
+ api_key=api_key,
+ timeout=60.0,
+ max_retries=3
+ )
+
+ # Available models
+ AVAILABLE_MODELS = {
+ "GPT-4o-mini": "gpt-4o-mini",
+ "GPT-4o": "gpt-4o",
+ "GPT-3.5 Turbo": "gpt-3.5-turbo"
+ }
+
+ # Define the tool for fetching papers
+ FETCH_PAPERS_TOOL = {
+ "type": "function",
+ "function": {
+ "name": "fetch_papers",
+ "description": "Fetch full text content of research papers by their filenames. Use this when you need detailed information, full text, conclusions, methodology, or specific quotes from papers.",
+ "parameters": {
+ "type": "object",
+ "properties": {
+ "filenames": {
+ "type": "array",
+ "items": {
+ "type": "string"
+ },
+ "description": "List of paper filenames to fetch (e.g., ['The Labor Market Effects of Generativ.txt', 'AI Companions Reduce Loneliness.txt'])"
+ }
+ },
+ "required": ["filenames"]
+ }
+ }
+ }
+
+ def fetch_papers(filenames: List[str]) -> Dict[str, str]:
+ """
+ Fetch full paper texts by filenames.
+ Returns a dictionary mapping filename to content.
+ """
  papers = {}
  papers_dir = "Papers"
 
  if not os.path.exists(papers_dir):
+ return {"error": "Papers directory not found"}
 
+ for filename in filenames:
+ # Ensure .txt extension
+ if not filename.endswith('.txt'):
+ filename += '.txt'
+
+ filepath = os.path.join(papers_dir, filename)
+
+ if os.path.exists(filepath):
  try:
  with open(filepath, "r", encoding="utf-8") as f:
+ papers[filename] = f.read()
  except Exception as e:
+ papers[filename] = f"Error loading paper: {str(e)}"
+ else:
+ papers[filename] = f"Paper not found: (unknown)"
 
  return papers
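The new `fetch_papers` helper can be exercised in isolation. This sketch adds a `papers_dir` parameter for testability, and spells the not-found message out with the filename where the diff shows a scrubbed placeholder; the sample filenames are invented for the demo:

```python
# Standalone sketch of the fetch_papers helper, run against a temporary
# Papers directory. The papers_dir parameter and sample filenames are
# additions for this demo, not part of the committed code.
import os
import tempfile
from typing import Dict, List

def fetch_papers(filenames: List[str], papers_dir: str = "Papers") -> Dict[str, str]:
    papers: Dict[str, str] = {}
    if not os.path.exists(papers_dir):
        return {"error": "Papers directory not found"}
    for filename in filenames:
        # Normalize to the on-disk .txt extension
        if not filename.endswith(".txt"):
            filename += ".txt"
        filepath = os.path.join(papers_dir, filename)
        if os.path.exists(filepath):
            try:
                with open(filepath, "r", encoding="utf-8") as f:
                    papers[filename] = f.read()
            except Exception as e:
                papers[filename] = f"Error loading paper: {str(e)}"
        else:
            papers[filename] = f"Paper not found: {filename}"
    return papers

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "pig_disease.txt"), "w", encoding="utf-8") as f:
        f.write("Full text of the pig disease paper.")
    result = fetch_papers(["pig_disease", "missing_paper"], papers_dir=d)
    print(result["pig_disease.txt"])    # the stored full text
    print(result["missing_paper.txt"])  # a "Paper not found" marker
```

Returning per-file error strings instead of raising keeps the tool response JSON-serializable, which matters because the result is passed back to the model as a tool message.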
 
  def extract_conclusion_from_paper(content: str) -> str:
+ """Extract the conclusion section from a paper's content."""
  conclusion_patterns = [
  "conclusion and future works",
  "conclusion and future work",
  lines = content.split('\n')
  conclusion_start = -1
 
  for i, line in enumerate(lines):
  line_lower = line.lower().strip()
  if any(pattern in line_lower for pattern in conclusion_patterns):
  if (line.isupper() or
  line.strip().endswith(':') or
  len(line.strip()) < 100 or
  break
 
  if conclusion_start != -1:
  conclusion_lines = []
  for line in lines[conclusion_start:]:
  line_stripped = line.strip()
  if (line_stripped.lower().startswith('acknowledgments') or
  line_stripped.lower().startswith('references') or
  line_stripped.startswith('--- Page')):
  break
  conclusion_lines.append(line)
 
+ return '\n'.join(conclusion_lines)
 
  # Fallback: return the last 1000 characters
  return content[-1000:] if len(content) > 1000 else content
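A small run of the conclusion extractor on a synthetic paper text. The hunk above elides some context lines, so the tail of the pattern list and the `conclusion_start = i` assignment are filled in here by assumption:

```python
# Sketch of extract_conclusion_from_paper on a synthetic paper. The
# "conclusions"/"conclusion" patterns and the conclusion_start assignment
# are assumed; the diff hides those context lines.
def extract_conclusion_from_paper(content: str) -> str:
    conclusion_patterns = [
        "conclusion and future works",
        "conclusion and future work",
        "conclusions",
        "conclusion",
    ]
    lines = content.split('\n')
    conclusion_start = -1
    for i, line in enumerate(lines):
        line_lower = line.lower().strip()
        if any(p in line_lower for p in conclusion_patterns):
            # Heuristic: section headers are upper-case, short, or end with ':'
            if line.isupper() or line.strip().endswith(':') or len(line.strip()) < 100:
                conclusion_start = i
                break
    if conclusion_start != -1:
        conclusion_lines = []
        for line in lines[conclusion_start:]:
            s = line.strip()
            # Stop at back matter or page markers
            if (s.lower().startswith('acknowledgments') or
                    s.lower().startswith('references') or
                    s.startswith('--- Page')):
                break
            conclusion_lines.append(line)
        return '\n'.join(conclusion_lines)
    # Fallback: return the last 1000 characters
    return content[-1000:] if len(content) > 1000 else content

paper = "Intro text\nCONCLUSION\nWe found X works well.\nReferences\n[1] ..."
print(extract_conclusion_from_paper(paper))  # header plus the body, references dropped
```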
 
  def truncate_conversation_history(messages: list, max_tokens: int = 8000) -> list:
+ """Truncate conversation history to stay within token limits."""
+ if len(messages) <= 3:
  return messages
 
  system_message = messages[0]
  conversation_messages = messages[1:]
 
+ while len(conversation_messages) > 6:
  conversation_messages = conversation_messages[2:]
 
  return [system_message] + conversation_messages
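The truncation policy can be checked in isolation: the system message always survives, and only the last three user/assistant exchanges are kept (note the `max_tokens` parameter is currently unused by the body):

```python
# Copy of truncate_conversation_history from the diff, exercised on a
# synthetic ten-exchange history to show what survives truncation.
def truncate_conversation_history(messages: list, max_tokens: int = 8000) -> list:
    if len(messages) <= 3:
        return messages
    system_message = messages[0]
    conversation_messages = messages[1:]
    # Drop the oldest user/assistant pair until 3 exchanges remain
    while len(conversation_messages) > 6:
        conversation_messages = conversation_messages[2:]
    return [system_message] + conversation_messages

history = [{"role": "system", "content": "sys"}]
for i in range(10):  # ten exchanges -> 20 conversation messages
    history.append({"role": "user", "content": f"q{i}"})
    history.append({"role": "assistant", "content": f"a{i}"})

trimmed = truncate_conversation_history(history)
print(len(trimmed))           # 7: system message + last 3 exchanges
print(trimmed[1]["content"])  # q7, the oldest surviving user turn
```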
 
149
  top_p: float,
150
  ) -> Iterator[str]:
151
  """
152
+ Generate a response using OpenAI's models with function calling.
 
153
  """
154
  if not client:
155
+ yield "❌ Error: OpenAI API key not configured."
156
  return
157
 
158
  if not message.strip():
159
  yield "Please enter a message to start the conversation."
160
  return
161
 
162
+ # Initialize messages with a concise system prompt
163
+ system_prompt = f"""You are an AI chatbot designed to help users explore and analyze AI research papers.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
164
 
165
+ You have access to:
166
+ 1. An abstracts database with summaries of research papers
167
+ 2. A tool to fetch full paper texts when needed
168
 
169
+ ABSTRACTS DATABASE:
170
  {ABSTRACTS_CONTENT}
171
 
172
+ INSTRUCTIONS:
173
+ - Answer questions using the abstracts when possible
174
+ - Use the fetch_papers tool when users ask for:
175
+ - Full papers or complete papers
176
+ - Specific details not in abstracts
177
+ - Conclusions, methodology, or quotes
178
+ - Any information requiring the full text
179
+ - When fetching papers, use the exact filename from the abstracts table
180
+ - Provide accurate, detailed responses based on the actual paper content"""
 
 
 
 
 
181
 
182
  messages = [{"role": "system", "content": system_prompt}]
183
 
184
+ # Add conversation history
185
  for user_msg, assistant_msg in history:
186
  if user_msg and user_msg.strip():
187
  messages.append({"role": "user", "content": user_msg.strip()})
 
191
  # Add current user message
192
  messages.append({"role": "user", "content": message.strip()})
193
 
194
+ # Truncate if needed
195
  messages = truncate_conversation_history(messages)
196
 
197
  try:
 
198
  model = AVAILABLE_MODELS.get(model_name, "gpt-4o-mini")
199
 
200
+ # Initial response with tool support
201
  response = client.chat.completions.create(
202
  model=model,
203
  messages=messages,
204
  max_tokens=max_tokens,
205
  temperature=temperature,
206
  top_p=top_p,
207
+ tools=[FETCH_PAPERS_TOOL],
208
+ tool_choice="auto",
209
+ stream=True
 
210
  )
211
 
212
+ # Collect the response and handle tool calls
213
+ full_response = ""
214
+ tool_calls = []
215
+ current_tool_call = None
216
+
217
  for chunk in response:
218
+ if hasattr(chunk.choices[0], 'delta'):
219
+ delta = chunk.choices[0].delta
220
+
221
+ # Handle regular content
222
+ if delta.content is not None:
223
+                full_response += delta.content
+                yield full_response
+
+            # Handle tool calls
+            if delta.tool_calls:
+                for tool_call_chunk in delta.tool_calls:
+                    if tool_call_chunk.id:
+                        # New tool call
+                        if current_tool_call:
+                            tool_calls.append(current_tool_call)
+                        current_tool_call = {
+                            "id": tool_call_chunk.id,
+                            "type": "function",
+                            "function": {
+                                "name": tool_call_chunk.function.name if tool_call_chunk.function else "",
+                                "arguments": ""
+                            }
+                        }
+
+                    if current_tool_call and tool_call_chunk.function:
+                        if tool_call_chunk.function.arguments:
+                            current_tool_call["function"]["arguments"] += tool_call_chunk.function.arguments
+
+        # Add the final tool call if one is still pending
+        if current_tool_call:
+            tool_calls.append(current_tool_call)
+
+        # Process tool calls if any
+        if tool_calls:
+            # Add the assistant's message with tool calls
+            messages.append({
+                "role": "assistant",
+                "content": full_response if full_response else None,
+                "tool_calls": tool_calls
+            })
+
+            # Execute tool calls
+            for tool_call in tool_calls:
+                function_name = tool_call["function"]["name"]
+
+                if function_name == "fetch_papers":
+                    try:
+                        # Parse arguments
+                        arguments = json.loads(tool_call["function"]["arguments"])
+                        filenames = arguments.get("filenames", [])
+
+                        # Fetch papers
+                        papers_content = fetch_papers(filenames)
+
+                        # Add tool response to messages
+                        tool_response = {
+                            "role": "tool",
+                            "tool_call_id": tool_call["id"],
+                            "content": json.dumps(papers_content)
+                        }
+                        messages.append(tool_response)
+
+                    except Exception as e:
+                        tool_response = {
+                            "role": "tool",
+                            "tool_call_id": tool_call["id"],
+                            "content": f"Error: {str(e)}"
+                        }
+                        messages.append(tool_response)
+
+            # Get the final response with tool results
+            final_response = client.chat.completions.create(
+                model=model,
+                messages=messages,
+                max_tokens=max_tokens,
+                temperature=temperature,
+                top_p=top_p,
+                stream=True
+            )
+
+            # Stream the final response
+            final_text = ""
+            for chunk in final_response:
+                if hasattr(chunk.choices[0], 'delta') and chunk.choices[0].delta.content is not None:
+                    final_text += chunk.choices[0].delta.content
+                    yield (full_response + "\n\n" + final_text) if full_response else final_text
 
    except Exception as e:
        error_message = f"Error: {str(e)}"
        if "api_key" in str(e).lower():
+            error_message = "Error: Invalid or missing OpenAI API key."
        elif "quota" in str(e).lower():
+            error_message = "Error: API quota exceeded."
        elif "rate" in str(e).lower():
+            error_message = "Error: Rate limit exceeded."
        yield error_message
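The keyword matching in the `except` branch above can be factored into a small, independently testable helper. A minimal sketch; the `classify_error` name is illustrative and not part of the app:

```python
def classify_error(e: Exception) -> str:
    """Map common OpenAI client errors to user-friendly messages
    by scanning the error text for known keywords."""
    text = str(e).lower()
    if "api_key" in text:
        return "Error: Invalid or missing OpenAI API key."
    if "quota" in text:
        return "Error: API quota exceeded."
    if "rate" in text:
        return "Error: Rate limit exceeded."
    # Fall back to the raw error text
    return f"Error: {e}"

print(classify_error(ValueError("Incorrect api_key provided")))
# β†’ Error: Invalid or missing OpenAI API key.
```

Keyword matching on error strings is brittle; the OpenAI SDK also raises typed exceptions (e.g. rate-limit and authentication errors) that could be caught explicitly instead.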
 
def chat_fn(message, history, model_name, max_tokens, temperature, top_p):
+    """Handle the entire chat interaction."""
 
 
 
    if not message.strip():
        return history

    history.append([message, ""])

    for response in respond(message, history[:-1], model_name, max_tokens, temperature, top_p):
        history[-1][1] = response
        yield history
 
    """Clear the conversation history."""
    return [], ""

+# Create the Gradio interface
with gr.Blocks(
    title="πŸ“š AI Research Paper Chatbot",
    theme=gr.themes.Soft(),
 
        """
        # πŸ“š AI Research Paper Chatbot

+        Chat with an AI assistant that can intelligently retrieve and analyze research papers.

        **Features:**
+        - πŸ“– Smart paper retrieval using function calling
+        - πŸ’¬ Contextual conversation memory
+        - πŸš€ Efficient token usage
        - πŸŽ›οΈ Multiple model selection
        """
    )
 
    with gr.Row():
        with gr.Column(scale=3):
            chatbot = gr.Chatbot(
                height=500,
                show_label=False,

            clear_btn = gr.Button("Clear", variant="secondary", scale=1)

        with gr.Column(scale=1):
            gr.Markdown("### βš™οΈ Settings")

            model_dropdown = gr.Dropdown(
 
                value=0.7,
                step=0.1,
                label="Temperature",
+                info="Creativity level"
            )

            top_p_slider = gr.Slider(

                info="Response diversity"
            )
 
 
            gr.Markdown("### πŸ’‘ Examples")
+            example_btn1 = gr.Button("What papers discuss AI's impact on employment?", size="sm")
+            example_btn2 = gr.Button("Show me the full paper about AI companions", size="sm")
+            example_btn3 = gr.Button("What's the conclusion of the pig disease detection paper?", size="sm")
+            example_btn4 = gr.Button("Compare findings on AI in education", size="sm")
 
+    # Event handlers
    msg.submit(
        chat_fn,
        [msg, chatbot, model_dropdown, max_tokens_slider, temperature_slider, top_p_slider],
        [chatbot],
        show_progress=True
    ).then(
+        lambda: "",
        outputs=[msg]
    )

        [chatbot],
        show_progress=True
    ).then(
+        lambda: "",
        outputs=[msg]
    )
 
    clear_btn.click(clear_history, outputs=[chatbot, msg])

+    # Example handlers
+    example_btn1.click(lambda: "What papers discuss AI's impact on employment?", outputs=msg)
+    example_btn2.click(lambda: "Show me the full paper about AI companions", outputs=msg)
+    example_btn3.click(lambda: "What's the conclusion of the pig disease detection paper?", outputs=msg)
+    example_btn4.click(lambda: "Compare findings on AI in education", outputs=msg)

if __name__ == "__main__":
    if not os.getenv("OPENAI_API_KEY"):
        print("⚠️ Warning: OPENAI_API_KEY environment variable not set!")

    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
+        share=False,
        show_error=True,
        quiet=False
    )
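The delta-accumulation logic in `respond` above can be exercised without hitting the API by feeding it simulated stream chunks. A minimal sketch of the same merging pattern, using `SimpleNamespace` stand-ins for the SDK's tool-call delta objects (names and data are illustrative):

```python
from types import SimpleNamespace

def accumulate_tool_calls(deltas):
    """Merge streamed tool-call deltas into complete tool-call dicts,
    mirroring the accumulation loop in respond()."""
    tool_calls, current = [], None
    for tc in deltas:
        if tc.id:
            # A non-empty id marks the start of a new tool call
            if current:
                tool_calls.append(current)
            current = {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name if tc.function else "",
                    "arguments": "",
                },
            }
        # Argument JSON arrives in fragments; concatenate them
        if current and tc.function and tc.function.arguments:
            current["function"]["arguments"] += tc.function.arguments
    if current:
        tool_calls.append(current)
    return tool_calls

# Simulated stream: one call whose JSON arguments arrive in two fragments.
chunks = [
    SimpleNamespace(id="call_1", function=SimpleNamespace(name="fetch_papers", arguments=None)),
    SimpleNamespace(id=None, function=SimpleNamespace(name=None, arguments='{"filenames": ')),
    SimpleNamespace(id=None, function=SimpleNamespace(name=None, arguments='["pigs.txt"]}')),
]
print(accumulate_tool_calls(chunks))
```

The arguments string is only valid JSON once the stream finishes, which is why `respond` parses it with `json.loads` only after all chunks are consumed.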
requirements.txt CHANGED
@@ -1,4 +1,3 @@
 openai>=1.98.0
 gradio==4.44.0
-python-dotenv>=1.0.0
 pydantic==2.10.6
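With `python-dotenv` removed, the app no longer loads a `.env` file; `os.getenv("OPENAI_API_KEY")` reads the key straight from the process environment. A minimal local setup (bash/zsh), using the same placeholder value as the README:

```shell
# No .env loading anymore: export the key in the shell that launches the app.
export OPENAI_API_KEY="your_openai_api_key_here"
```

On Hugging Face Spaces, set `OPENAI_API_KEY` as a Space secret instead; it is injected into the environment the same way.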