# Conversation History Feature

## Overview
The backend now supports **multi-turn conversations** with persistent history across requests. Each conversation is tracked using a unique `session_id`.

## Features
- ✅ Session-based conversation tracking
- ✅ Automatic session cleanup (1-hour timeout by default)
- ✅ Message history trimming (max 20 messages per session)
- ✅ Works with `/ask`, `/ask_stream`, and `/code_assist` endpoints
- ✅ History management endpoints

## How to Use

### 1. Starting a Conversation (without session_id)
If the request omits `session_id`, the backend creates a new session automatically:

```bash
curl -X POST http://127.0.0.1:7860/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Hello, can you help me write a Python function?",
    "metadata": {"skill": "moderate_learner", "emotion": "neutral"}
  }'
```

Response includes `session_id`:
```json
{
  "reply": "I'd be happy to help...",
  "session_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

### 2. Continuing a Conversation (with session_id)
Pass the `session_id` to continue the conversation:

```bash
curl -X POST http://127.0.0.1:7860/ask \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Can you make it handle edge cases?",
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "metadata": {"skill": "moderate_learner", "emotion": "neutral"}
  }'
```

The AI will remember the previous context (the function you were discussing).
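The two steps above can also be driven from Python. The following is a minimal client-side sketch, not part of the backend: `post_json` and `Conversation` are hypothetical helper names, and it assumes the response always carries `reply` and `session_id` as in the example response above.

```python
import json
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://127.0.0.1:7860"  # host and port taken from the curl examples


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON body to the backend and decode the JSON response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


class Conversation:
    """Hypothetical helper that remembers the session_id between turns."""

    def __init__(self, send: Callable[[str, dict], dict] = post_json):
        self.send = send
        self.session_id: Optional[str] = None

    def ask(self, query: str, metadata: Optional[dict] = None) -> str:
        payload = {"query": query}
        if self.session_id is not None:
            payload["session_id"] = self.session_id  # continue the existing session
        if metadata is not None:
            payload["metadata"] = metadata
        body = self.send("/ask", payload)
        # Keep whatever session_id the server returned for the next turn.
        self.session_id = body.get("session_id", self.session_id)
        return body["reply"]
```

The first call to `ask` sends no `session_id`, so the backend creates a session. Every later call on the same `Conversation` object reuses the returned ID.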

### 3. Streaming Responses
Works the same way with `/ask_stream`:

```bash
curl -X POST http://127.0.0.1:7860/ask_stream \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Explain how this works",
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
  }'
```

The `session_id` is sent as a server-sent event (SSE): `event: session\ndata: <session_id>\n\n`
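A client has to pick the `session` event out of the stream to continue the conversation. The sketch below parses SSE text lines generically. `parse_sse` and `find_session_id` are hypothetical names, and the format of the other events in the stream is not documented here, so they are passed through untouched.

```python
from typing import Iterable, Iterator, Optional, Tuple


def _field_value(line: str, prefix: str) -> str:
    """Return the text after 'prefix', dropping one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = _field_value(line, "event:")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data:"))


def find_session_id(lines: Iterable[str]) -> Optional[str]:
    """Return the data of the first 'session' event, or None if there is none."""
    for event, data in parse_sse(lines):
        if event == "session":
            return data
    return None
```

Any iterable of decoded lines works as input, for example the lines of a streamed HTTP response body.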

### 4. Code Assistance with History
Use `/code_assist` with conversation history:

```bash
curl -X POST http://127.0.0.1:7860/code_assist \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Write a function to calculate fibonacci",
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "metadata": {"skill": "moderate_learner", "emotion": "curious"}
  }'
```

## History Management Endpoints

### Get Conversation History
```bash
curl -X POST http://127.0.0.1:7860/get_history \
  -H "Content-Type: application/json" \
  -d '{"session_id": "550e8400-e29b-41d4-a716-446655440000"}'
```

Response:
```json
{
  "session_id": "550e8400-...",
  "messages": [
    {"role": "system", "content": "You are CodeMate..."},
    {"role": "user", "content": "Hello..."},
    {"role": "assistant", "content": "I'd be happy..."}
  ],
  "message_count": 3,
  "last_active": 1699459200.123
}
```

### Clear History for a Session
```bash
curl -X POST http://127.0.0.1:7860/clear_history \
  -H "Content-Type: application/json" \
  -d '{"session_id": "550e8400-e29b-41d4-a716-446655440000"}'
```

### Clear All Sessions
Send an empty JSON body (no `session_id`) to clear every session:

```bash
curl -X POST http://127.0.0.1:7860/clear_history \
  -H "Content-Type: application/json" \
  -d '{}'
```

### List All Active Sessions
```bash
curl http://127.0.0.1:7860/list_sessions
```

Response:
```json
{
  "sessions": [
    {
      "session_id": "550e8400-...",
      "message_count": 5,
      "last_active": 1699459200.123
    }
  ],
  "total_count": 1
}
```
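The `last_active` values look like Unix timestamps in seconds, although that is an assumption rather than something stated above. Under that assumption, a small sketch for turning a `/list_sessions` entry into a readable summary (`describe_session` is a hypothetical helper):

```python
from datetime import datetime, timezone


def idle_seconds(session: dict, now: float) -> float:
    """Seconds since the session was last active (assumes Unix epoch seconds)."""
    return max(0.0, now - session["last_active"])


def describe_session(session: dict, now: float) -> str:
    """Format one entry of the 'sessions' list as a single line of text."""
    last = datetime.fromtimestamp(session["last_active"], tz=timezone.utc)
    return (
        f'{session["session_id"]}: {session["message_count"]} messages, '
        f"idle {idle_seconds(session, now):.0f}s, "
        f"last active {last:%Y-%m-%d %H:%M:%S} UTC"
    )
```

Comparing `idle_seconds` against the configured timeout shows which sessions are close to expiring.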

## Configuration (Environment Variables)

```bash
# Session timeout in seconds (default: 3600 = 1 hour)
HISTORY_TIMEOUT_SECONDS=3600

# Maximum messages per session (default: 20)
MAX_HISTORY_LENGTH=20
```
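How the backend actually reads these variables is not shown here. One plausible way, sketched with a hypothetical `read_int_env` helper that falls back to the documented defaults:

```python
import os


def read_int_env(name: str, default: int) -> int:
    """Read a positive integer from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value


# Defaults match the values documented above.
HISTORY_TIMEOUT_SECONDS = read_int_env("HISTORY_TIMEOUT_SECONDS", 3600)
MAX_HISTORY_LENGTH = read_int_env("MAX_HISTORY_LENGTH", 20)
```

Failing fast on a non-numeric or non-positive value is a design choice of this sketch. The real backend may silently fall back to the default instead.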

## Architecture

### Message Format
Messages follow the OpenAI chat format:
```python
{
  "role": "user",  # one of "system", "user", or "assistant"
  "content": "message text"
}
```

### Storage
- In-memory storage using `defaultdict`
- Auto-cleanup of old sessions once more than 100 sessions exist
- Each session stores messages + last_active timestamp
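The storage model described above can be sketched as follows. This is an illustration under stated assumptions, not the backend's actual code: the names `sessions`, `add_message`, and `cleanup_expired` are hypothetical, and "old" is interpreted as "idle longer than the session timeout".

```python
import time
from collections import defaultdict
from typing import Optional

HISTORY_TIMEOUT_SECONDS = 3600  # documented default
CLEANUP_THRESHOLD = 100         # cleanup runs once more than this many sessions exist

# session_id -> {"messages": [...], "last_active": <unix timestamp>}
sessions = defaultdict(lambda: {"messages": [], "last_active": 0.0})


def cleanup_expired(now: float) -> int:
    """Drop sessions idle longer than the timeout; return how many were removed."""
    if len(sessions) <= CLEANUP_THRESHOLD:
        return 0
    expired = [
        sid for sid, session in sessions.items()
        if now - session["last_active"] > HISTORY_TIMEOUT_SECONDS
    ]
    for sid in expired:
        del sessions[sid]
    return len(expired)


def add_message(session_id: str, role: str, content: str,
                now: Optional[float] = None) -> None:
    """Append a message, refresh last_active, then run the cleanup check."""
    now = time.time() if now is None else now
    session = sessions[session_id]  # defaultdict creates the session on first use
    session["messages"].append({"role": role, "content": content})
    session["last_active"] = now
    cleanup_expired(now)
```

Because `last_active` is refreshed before the cleanup check, the session being written to is never removed by its own request.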

### Trimming
When a session exceeds `MAX_HISTORY_LENGTH`:
- System message is preserved
- Only recent messages are kept
- This bounds the prompt size, which helps keep requests within the model's token limit
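A minimal sketch of this trimming rule, assuming the system message (if present) is always the first entry in the list. `trim_history` is a hypothetical name, and the backend's real implementation may differ in detail.

```python
def trim_history(messages: list, max_length: int = 20) -> list:
    """Keep a leading system message plus the most recent messages."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(messages) <= max_length:
        return list(messages)
    if messages[0]["role"] == "system":
        # Reserve one slot for the system message, fill the rest from the tail.
        keep = max_length - 1
        tail = messages[-keep:] if keep > 0 else []
        return [messages[0]] + tail
    return messages[-max_length:]
```

With the default limit of 20, a session holding a system message plus 30 other messages is cut down to the system message and the 19 most recent messages.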

## Example Multi-Turn Conversation

```bash
# Turn 1
curl -X POST http://127.0.0.1:7860/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Write a Python function to reverse a string"}'

# Response: {"reply": "Here's a function...", "session_id": "abc-123"}

# Turn 2 (AI remembers the function from Turn 1)
curl -X POST http://127.0.0.1:7860/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Can you add error handling to it?", "session_id": "abc-123"}'

# Turn 3 (AI remembers both previous turns)
curl -X POST http://127.0.0.1:7860/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "Now make it work with unicode", "session_id": "abc-123"}'
```

## Notes
- Sessions are stored in memory (cleared on server restart)
- For production, consider using Redis or a database
- The cache still applies to identical first messages, i.e. requests with no prior history
- Responses that depend on conversation history are NOT cached, because each message with history is unique