Paperbot6 / README.md
Ina-Shapiro's picture
Refactor app.py to enhance paper fetching functionality and improve error handling. Update README.md to reflect new features and usage instructions. Remove dotenv dependency from requirements.txt.
028ef27
---
title: AI Research Paper Chatbot
emoji: πŸ“š
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
app_port: 7860
---
# πŸ“š AI Research Paper Chatbot
A modern conversational AI chatbot designed specifically for exploring and analyzing AI research papers. Features full paper text access, conversation memory, real-time streaming, and intelligent paper search.
## ✨ Latest Features
- πŸ“– **Smart Function Calling**: Intelligent paper retrieval using OpenAI's function calling API
- πŸ” **Dynamic Paper Fetching**: Automatically fetches full paper texts when needed
- 🧠 **Contextual Conversation Memory**: Maintains chat history with intelligent truncation
- πŸš€ **Real-time Streaming**: Instant response streaming for better UX
- πŸŽ›οΈ **Multiple Model Selection**: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
- βš™οΈ **Advanced Parameters**: Fine-tune temperature, max tokens, and top-p
- 🎨 **Modern UI**: Responsive design with intuitive controls
- πŸ›‘οΈ **Robust Error Handling**: Clear error messages for common issues
- πŸ“± **Mobile Responsive**: Works great on all devices
## πŸš€ Quick Start
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Get OpenAI API Key
1. Visit [OpenAI Platform](https://platform.openai.com/api-keys)
2. Create an account or sign in
3. Generate a new API key
4. Copy the API key
### 3. Configure Environment
#### For Local Development
Set your OpenAI API key as an environment variable:
**Windows (PowerShell):**
```powershell
$env:OPENAI_API_KEY="your_openai_api_key_here"
```
**Windows (Command Prompt):**
```cmd
set OPENAI_API_KEY=your_openai_api_key_here
```
**Linux/macOS:**
```bash
export OPENAI_API_KEY="your_openai_api_key_here"
```
#### For Hugging Face Spaces Deployment
1. Go to your Space settings
2. Click on "Settings" tab
3. Scroll down to "Repository secrets"
4. Click "New secret"
5. **Name**: `OPENAI_API_KEY`
6. **Value**: Your actual OpenAI API key
7. Click "Add secret"
**Important**: Replace `your_openai_api_key_here` with your actual OpenAI API key.
### 4. Add Your Papers
Place your research paper text files in the `Papers/` directory. The system will automatically load all `.txt` files.
### 5. Run the Application
```bash
python app.py
```
The chatbot will be available at `http://localhost:7860`
## 🎯 Usage Guide
### Basic Paper Exploration
1. **Ask about specific topics**: "What papers discuss AI's impact on employment?"
2. **Request full papers**: "Show me the full paper about AI companions"
3. **Get detailed information**: "What's the conclusion of the pig disease detection paper?"
4. **Compare findings**: "Compare findings on AI in education"
5. **Ask for specific details**: "What methodology did they use in the pig disease paper?"
### Advanced Controls
#### Model Selection
- **GPT-4o-mini**: Fast, cost-effective (default)
- **GPT-4o**: Most capable, higher cost
- **GPT-3.5 Turbo**: Fastest, most affordable
#### Parameter Tuning
- **System Message**: Define AI personality and behavior
- **Max Tokens**: Control response length (1-4096)
- **Temperature**: Adjust creativity (0.0 = focused, 2.0 = creative)
- **Top-p**: Control response diversity (0.0-1.0)
#### Conversation Management
- **Clear Button**: Reset conversation history
- **Example Buttons**: Quick-start with sample messages
## πŸ“š Paper Database Features
### Automatic Paper Loading
- All `.txt` files in the `Papers/` directory are automatically loaded
- Paper titles are extracted from filenames
- Full text content is available for detailed analysis
### Intelligent Search
- **Keyword Matching**: Finds papers based on user query terms
- **Relevance Scoring**: Ranks papers by relevance to the query
- **Context-Aware**: Provides relevant paper excerpts for detailed responses
### Full Paper Access
- **Complete Text**: Access entire paper content when requested
- **Direct Quotes**: Get exact quotes from papers
- **Detailed Analysis**: Comprehensive answers including conclusions and methodology
## πŸ”§ Technical Details
### Latest OpenAI API Features
- **OpenAI SDK v1.98.0+**: Latest API patterns and features
- **Streaming Responses**: Real-time token streaming
- **Smart Retry Logic**: Automatic retry on failures
- **Timeout Handling**: 60-second request timeout
- **Error Classification**: Specific error messages for different issues
### Paper Processing
- **Automatic Loading**: Papers loaded at startup for fast access
- **Smart Search**: Keyword-based relevance scoring
- **Content Truncation**: Intelligent content selection for context
- **Full Text Access**: Complete paper retrieval when needed
### Conversation Memory
- **Intelligent Truncation**: Keeps recent messages while staying within limits
- **System Message Preservation**: Always maintains AI personality
- **Context Awareness**: Full conversation history for contextual responses
### Performance Optimizations
- **Async Processing**: Non-blocking UI during API calls
- **Memory Management**: Efficient conversation history handling
- **Error Recovery**: Graceful handling of API failures
## πŸ› οΈ Configuration
### Environment Variables
```bash
OPENAI_API_KEY=your_api_key_here
```
### Model Parameters
```python
# Available models
AVAILABLE_MODELS = {
"GPT-4o-mini": "gpt-4o-mini",
"GPT-4o": "gpt-4o",
"GPT-3.5 Turbo": "gpt-3.5-turbo"
}
```
### Paper Directory Structure
```
Papers/
β”œβ”€β”€ Paper Title 1.txt
β”œβ”€β”€ Paper Title 2.txt
└── ...
```
## πŸ› Troubleshooting
### Common Issues
**API Key Errors**
- Ensure your `OPENAI_API_KEY` environment variable is set correctly
- Check that the API key has sufficient credits
- For Hugging Face Spaces: Verify the secret is named `OPENAI_API_KEY`
**Paper Loading Issues**
- Ensure papers are in `.txt` format
- Check that the `Papers/` directory exists
- Verify file encoding (UTF-8 recommended)
**Rate Limiting**
- Wait a moment and try again
- Consider using a different model
**Connection Issues**
- Check your internet connection
- Verify OpenAI API status at https://status.openai.com
**Memory Issues**
- Conversation history is maintained in memory during the session
- Long conversations are automatically truncated
### Error Messages
- **"Invalid API key"**: Check your environment variable or Hugging Face Spaces secrets
- **"Quota exceeded"**: Add credits to your OpenAI account
- **"Rate limit"**: Wait and retry
- **"Paper not found"**: Check that the paper file exists in the Papers directory
## πŸ“Š Model Comparison
| Model | Speed | Cost | Capability | Best For |
|-------|-------|------|------------|----------|
| GPT-4o-mini | Fast | Low | Good | General chat, quick responses |
| GPT-4o | Medium | High | Excellent | Complex tasks, detailed analysis |
| GPT-3.5 Turbo | Fastest | Lowest | Good | Simple queries, high volume |
## πŸ”„ Recent Updates
- βœ… Added full paper text access functionality
- βœ… Implemented intelligent paper search
- βœ… Added automatic paper loading from Papers directory
- βœ… Enhanced system prompt with paper content
- βœ… Added example buttons for paper exploration
- βœ… Updated to OpenAI SDK v1.98.0+
- βœ… Added multiple model selection
- βœ… Improved error handling and messages
- βœ… Enhanced conversation memory management
- βœ… Added smart conversation truncation
- βœ… Modernized UI with better responsive design
- βœ… Fixed Pydantic compatibility issues
- βœ… Improved Hugging Face Spaces deployment
## πŸ“ License
This project is open source and available under the MIT License.