---
title: AI Research Paper Chatbot
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
app_port: 7860
---

# 📚 AI Research Paper Chatbot

A conversational AI chatbot for exploring and analyzing AI research papers. It offers full paper text access, conversation memory, real-time streaming, and intelligent paper search.

## ✨ Latest Features

- 📖 **Smart Function Calling**: Intelligent paper retrieval using OpenAI's function calling API
- 🔍 **Dynamic Paper Fetching**: Automatically fetches full paper texts when needed
- 🧠 **Contextual Conversation Memory**: Maintains chat history with intelligent truncation
- 🚀 **Real-time Streaming**: Instant response streaming for better UX
- 🎛️ **Multiple Model Selection**: Choose between GPT-4o, GPT-4o-mini, and GPT-3.5 Turbo
- ⚙️ **Advanced Parameters**: Fine-tune temperature, max tokens, and top-p
- 🎨 **Modern UI**: Responsive design with intuitive controls
- 🛡️ **Robust Error Handling**: Clear error messages for common issues
- 📱 **Mobile Responsive**: Works great on all devices

## 🚀 Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Get OpenAI API Key

1. Visit [OpenAI Platform](https://platform.openai.com/api-keys)
2. Create an account or sign in
3. Generate a new API key
4. Copy the API key

### 3. Configure Environment

#### For Local Development
Set your OpenAI API key as an environment variable:

**Windows (PowerShell):**
```powershell
$env:OPENAI_API_KEY="your_openai_api_key_here"
```

**Windows (Command Prompt):**
```cmd
set OPENAI_API_KEY=your_openai_api_key_here
```

**Linux/macOS:**
```bash
export OPENAI_API_KEY="your_openai_api_key_here"
```

#### For Hugging Face Spaces Deployment
1. Open your Space on Hugging Face
2. Click the "Settings" tab
3. Scroll down to "Repository secrets"
4. Click "New secret"
5. **Name**: `OPENAI_API_KEY`
6. **Value**: Your actual OpenAI API key
7. Click "Add secret"

**Important**: Replace `your_openai_api_key_here` with your actual OpenAI API key.
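
To fail early with a readable message when the key is missing, the application could check the variable at startup. The following is a minimal sketch, not the code in `app.py`; the helper name `get_api_key` is hypothetical.

```python
import os


def get_api_key(env=os.environ):
    """Return the OpenAI API key from the environment, or raise a clear error.

    `get_api_key` is a hypothetical helper used for illustration only.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it locally or add it as a Space secret."
        )
    return key
```

Passing the environment as an argument keeps the check easy to test without touching the real process environment.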

### 4. Add Your Papers

Place your research paper text files in the `Papers/` directory. The system will automatically load all `.txt` files.

### 5. Run the Application

```bash
python app.py
```

The chatbot will be available at `http://localhost:7860`.

## 🎯 Usage Guide

### Basic Paper Exploration
1. **Ask about specific topics**: "What papers discuss AI's impact on employment?"
2. **Request full papers**: "Show me the full paper about AI companions"
3. **Get detailed information**: "What's the conclusion of the pig disease detection paper?"
4. **Compare findings**: "Compare findings on AI in education"
5. **Ask for specific details**: "What methodology did they use in the pig disease paper?"

### Advanced Controls

#### Model Selection
- **GPT-4o-mini**: Fast, cost-effective (default)
- **GPT-4o**: Most capable, higher cost
- **GPT-3.5 Turbo**: Fastest, most affordable

#### Parameter Tuning
- **System Message**: Define AI personality and behavior
- **Max Tokens**: Control response length (1-4096)
- **Temperature**: Adjust creativity (0.0 = focused, 2.0 = creative)
- **Top-p**: Control response diversity (0.0-1.0)
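
The ranges above can be enforced before a request is sent. This is a small sketch under the assumption that out-of-range values should be clamped rather than rejected; `clamp` and `normalize_params` are hypothetical names, not functions from `app.py`.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def normalize_params(max_tokens, temperature, top_p):
    """Clamp the tuning parameters to the ranges documented in this README."""
    return {
        "max_tokens": int(clamp(max_tokens, 1, 4096)),
        "temperature": float(clamp(temperature, 0.0, 2.0)),
        "top_p": float(clamp(top_p, 0.0, 1.0)),
    }
```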

#### Conversation Management
- **Clear Button**: Reset conversation history
- **Example Buttons**: Quick-start with sample messages

## 📚 Paper Database Features

### Automatic Paper Loading
- All `.txt` files in the `Papers/` directory are automatically loaded
- Paper titles are extracted from filenames
- Full text content is available for detailed analysis
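
The loading behavior described above might look roughly like the sketch below. It assumes the title is the filename without its `.txt` extension; `load_papers` is a hypothetical name and the actual loader in `app.py` may differ.

```python
from pathlib import Path


def load_papers(directory="Papers"):
    """Map each paper title (the filename without .txt) to its full text."""
    papers = {}
    folder = Path(directory)
    if not folder.is_dir():
        return papers  # a missing directory yields an empty collection
    for path in sorted(folder.glob("*.txt")):
        # errors="replace" keeps loading even if a file is not clean UTF-8
        papers[path.stem] = path.read_text(encoding="utf-8", errors="replace")
    return papers
```

Files with other extensions are ignored, which matches the `.txt`-only rule stated above.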

### Intelligent Search
- **Keyword Matching**: Finds papers based on user query terms
- **Relevance Scoring**: Ranks papers by relevance to the query
- **Context-Aware**: Provides relevant paper excerpts for detailed responses
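
Keyword matching with relevance scoring can be sketched as a term count over titles and text. The weighting below (title hits count five times as much as body hits) is an assumption chosen for illustration, and `search_papers` is a hypothetical name rather than the app's actual function.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_papers(query, papers, limit=3):
    """Rank papers by how often the query terms occur in title and text."""
    terms = set(tokenize(query))
    scored = []
    for title, text in papers.items():
        title_words = tokenize(title)
        body_words = tokenize(text)
        # Assumed weighting: a title hit is worth 5, a body hit is worth 1.
        score = sum(5 * title_words.count(t) + body_words.count(t) for t in terms)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]
```

Exact word matching means "pigs" does not match "pig"; a real implementation might add stemming or embeddings.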

### Full Paper Access
- **Complete Text**: Access entire paper content when requested
- **Direct Quotes**: Get exact quotes from papers
- **Detailed Analysis**: Comprehensive answers including conclusions and methodology
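
Full paper retrieval through function calling could be wired up along the following lines. The tool name `get_full_paper`, its single `title` parameter, and the `handle_tool_call` dispatcher are all hypothetical; only the overall shape of the tool definition follows the Chat Completions `tools` format.

```python
import json

# Hypothetical tool definition in the Chat Completions "tools" format.
GET_PAPER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_full_paper",
        "description": "Return the full text of a paper by its title.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Paper title"}
            },
            "required": ["title"],
        },
    },
}


def handle_tool_call(name, arguments_json, papers):
    """Run a tool call requested by the model and return the text to send back."""
    if name != "get_full_paper":
        return f"Unknown tool: {name}"
    # The model supplies its arguments as a JSON-encoded string.
    title = json.loads(arguments_json).get("title", "")
    return papers.get(title, f"Paper not found: {title}")
```

The returned string would then be sent back to the model as a tool message so it can answer using the paper's content.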

## 🔧 Technical Details

### Latest OpenAI API Features
- **OpenAI SDK v1.98.0+**: Latest API patterns and features
- **Streaming Responses**: Real-time token streaming
- **Smart Retry Logic**: Automatic retry on failures
- **Timeout Handling**: 60-second request timeout
- **Error Classification**: Specific error messages for different issues
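
Retry logic of this kind is often implemented as exponential backoff. The sketch below is a generic illustration with a hypothetical helper name, `call_with_retry`; it is not taken from `app.py`. The OpenAI Python client also accepts `timeout` and `max_retries` arguments, which may be all the app needs.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies.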

### Paper Processing
- **Automatic Loading**: Papers loaded at startup for fast access
- **Smart Search**: Keyword-based relevance scoring
- **Content Truncation**: Intelligent content selection for context
- **Full Text Access**: Complete paper retrieval when needed

### Conversation Memory
- **Intelligent Truncation**: Keeps recent messages while staying within limits
- **System Message Preservation**: Always maintains AI personality
- **Context Awareness**: Full conversation history for contextual responses
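
One simple way to truncate history while preserving the system message is to keep every system message plus the most recent turns. This sketch assumes a message-count limit; the real app may count tokens instead, and `truncate_history` is a hypothetical name.

```python
def truncate_history(messages, max_messages=20):
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Guard against max_messages == 0, where rest[-0:] would keep everything.
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent
```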

### Performance Optimizations
- **Async Processing**: Non-blocking UI during API calls
- **Memory Management**: Efficient conversation history handling
- **Error Recovery**: Graceful handling of API failures

## 🛠️ Configuration

### Environment Variables
```bash
OPENAI_API_KEY=your_api_key_here
```

### Model Parameters
```python
# Available models
AVAILABLE_MODELS = {
    "GPT-4o-mini": "gpt-4o-mini",
    "GPT-4o": "gpt-4o", 
    "GPT-3.5 Turbo": "gpt-3.5-turbo"
}
```
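
A display name chosen in the UI has to be translated into a model ID before the API call. A possible lookup, falling back to the default model noted under Model Selection, is sketched here; `DEFAULT_MODEL` and `resolve_model` are hypothetical names.

```python
AVAILABLE_MODELS = {
    "GPT-4o-mini": "gpt-4o-mini",
    "GPT-4o": "gpt-4o",
    "GPT-3.5 Turbo": "gpt-3.5-turbo",
}

# Assumed default, matching the "(default)" note under Model Selection.
DEFAULT_MODEL = "GPT-4o-mini"


def resolve_model(display_name):
    """Return the API model ID for a display name, or the default if unknown."""
    return AVAILABLE_MODELS.get(display_name, AVAILABLE_MODELS[DEFAULT_MODEL])
```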

### Paper Directory Structure
```
Papers/
├── Paper Title 1.txt
├── Paper Title 2.txt
└── ...
```

## 🐛 Troubleshooting

### Common Issues

**API Key Errors**
- Ensure your `OPENAI_API_KEY` environment variable is set correctly
- Check that the API key has sufficient credits
- For Hugging Face Spaces: Verify the secret is named `OPENAI_API_KEY`

**Paper Loading Issues**
- Ensure papers are in `.txt` format
- Check that the `Papers/` directory exists
- Verify file encoding (UTF-8 recommended)

**Rate Limiting**
- Wait a moment and try again
- Consider using a different model

**Connection Issues**
- Check your internet connection
- Verify OpenAI API status at https://status.openai.com

**Memory Issues**
- Conversation history is maintained in memory during the session
- Long conversations are automatically truncated

### Error Messages
- **"Invalid API key"**: Check your environment variable or Hugging Face Spaces secrets
- **"Quota exceeded"**: Add credits to your OpenAI account
- **"Rate limit"**: Wait and retry
- **"Paper not found"**: Check that the paper file exists in the Papers directory
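
API errors can be mapped to messages like these from the HTTP status and the API's error code. The sketch below assumes that 401 indicates a bad key and that 429 covers both rate limiting and exhausted quota (distinguished by the `insufficient_quota` code); `classify_error` is a hypothetical helper, not the app's actual handler.

```python
def classify_error(status_code, error_code=None):
    """Translate an HTTP status and optional API error code into a user hint."""
    if status_code == 401:
        return "Invalid API key: check your environment variable or Space secret."
    if status_code == 429 and error_code == "insufficient_quota":
        return "Quota exceeded: add credits to your OpenAI account."
    if status_code == 429:
        return "Rate limit: wait a moment and retry."
    return f"Unexpected error (HTTP {status_code})."
```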

## 📊 Model Comparison

| Model | Speed | Cost | Capability | Best For |
|-------|-------|------|------------|----------|
| GPT-4o-mini | Fast | Low | Good | General chat, quick responses |
| GPT-4o | Medium | High | Excellent | Complex tasks, detailed analysis |
| GPT-3.5 Turbo | Fastest | Lowest | Good | Simple queries, high volume |

## 🔄 Recent Updates

- ✅ Added full paper text access functionality
- ✅ Implemented intelligent paper search
- ✅ Added automatic paper loading from Papers directory
- ✅ Enhanced system prompt with paper content
- ✅ Added example buttons for paper exploration
- ✅ Updated to OpenAI SDK v1.98.0+
- ✅ Added multiple model selection
- ✅ Improved error handling and messages
- ✅ Enhanced conversation memory management
- ✅ Added smart conversation truncation
- ✅ Modernized UI with better responsive design
- ✅ Fixed Pydantic compatibility issues
- ✅ Improved Hugging Face Spaces deployment

## 📝 License

This project is open source and available under the MIT License.