---
title: "My RecoFM AI Agent Demo"
emoji: "🎬"
colorFrom: pink
colorTo: red
sdk: gradio
sdk_version: 4.31.0
app_file: app.py
license: apache-2.0
tags:
  - agent-demo-track
  - recommender-system
  - gradio
---

# Movie Recommender System

Tag: **agent-demo-track**

A hybrid movie recommender system that combines collaborative filtering, language model embeddings, and graph convolutional networks to provide personalized movie recommendations.

## Features

### Dual Embedding Types

- **Pure Language Model (LLM) Embeddings**  
  Generated for each movie title using Mistral AI.

- **Graph-Enhanced Embeddings (LLM + GCL)**  
  Combines language understanding with user interaction patterns to enrich the embeddings.

---

### Hybrid Input

- **Movie Selection**  
  Select movies you've previously enjoyed.

- **Natural Language Query**  
  Describe the kind of movie you're looking for in natural language.

- **Weight Adjustment (α)**  
  Adjust the balance between your movie selections and your text description to personalize the recommendations.

---

### Algorithm

- **Embedding Aggregation**  
  Convert the user's text description into an embedding and combine it, weighted by α, with the embeddings of the selected movies to form a single query embedding.

- **Retrieval Phase**  
  Retrieve the top 100 candidate movies based on cosine similarity between the query embedding and movie embeddings.

- **Ranking Phase**  
  Use an AI agent to rank the top 100 candidates and select the final top 10 recommendations, considering:
  - User preferences  
  - Viewing history  
  - Weight parameter (α)
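
The retrieval phase can be sketched in a few lines. This is a minimal illustration, not the app's actual code: it assumes the query embedding has already been built, that movie embeddings are held in a plain dict keyed by movie id, and the names `cosine_similarity`, `retrieve_candidates`, and the `exclude` parameter are hypothetical. A real implementation would likely vectorize this with NumPy.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(query, movie_embeddings, k=100, exclude=()):
    """Return the ids of the k movies most similar to the query, best first.

    movie_embeddings: dict mapping a movie id to its embedding vector.
    exclude: ids to skip, e.g. already-selected movies (an assumption,
             the README does not say whether the app filters them out).
    """
    skipped = set(exclude)
    scored = (
        (cosine_similarity(query, vector), movie_id)
        for movie_id, vector in movie_embeddings.items()
        if movie_id not in skipped
    )
    # nlargest keeps only the k best (score, id) pairs, sorted by score.
    return [movie_id for _, movie_id in heapq.nlargest(k, scored)]
```

The returned list of up to 100 ids is what the ranking agent would then narrow down to the final 10.
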

## Requirements

1. Python 3.8+
2. Virtual environment (recommended)
3. Mistral AI API key (get one at https://console.mistral.ai/)

Install the required packages:

```bash
pip install -r requirements.txt
```

## Environment Setup

1. Create a `.env` file in the project root:
   ```bash
   MISTRAL_API_KEY=your_api_key_here
   ```

2. Ensure you have the necessary data files in the `amazon_movies_2023` directory:
   - `title_embeddings.npz`: Movie title embeddings from Mistral AI
   - `gcl_embeddings.npz`: Graph-enhanced embeddings
   - `title_embeddings_mapping.csv`: Movie metadata mapping
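
Before launching the app, it can help to verify both setup steps in one go. The sketch below is a hypothetical helper (`find_setup_problems` is not part of this repository) that checks the process environment for `MISTRAL_API_KEY` and looks for the three data files listed above. It assumes the `.env` file has already been loaded into the environment, for example by a loader such as python-dotenv.

```python
import os
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = (
    "title_embeddings.npz",
    "gcl_embeddings.npz",
    "title_embeddings_mapping.csv",
)


def find_setup_problems(data_dir="amazon_movies_2023", env=None):
    """Return a list of human-readable setup problems (empty if all is well).

    env: mapping to read the API key from; defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("MISTRAL_API_KEY"):
        problems.append("MISTRAL_API_KEY is not set")
    root = Path(data_dir)
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing data file: {root / name}")
    return problems


if __name__ == "__main__":
    for problem in find_setup_problems():
        print(problem)
```

Run from the project root, the script prints one line per missing item and nothing when the setup is complete.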

## Usage

1. Activate your virtual environment:
   ```bash
   source venv/bin/activate  # On Unix/macOS
   ```

2. Run the recommender app:
   ```bash
   python movie_recommender_app.py
   ```

3. Open your browser to the local URL shown in the terminal (typically http://127.0.0.1:7860).

## How It Works

1. **Movie Selection:**
   - Search for and select up to 5 movies you've enjoyed
   - The system uses these as a baseline for your taste

2. **Text Preferences:**
   - Describe what you're looking for (e.g., "A thrilling sci-fi movie with deep philosophical themes")
   - Your description is converted to embeddings using Mistral AI

3. **Preference Weighting:**
   - Use the α slider to balance between your selected movies and text description
   - α = 0: Only use movie history
   - α = 1: Only use text description
   - Values in between combine both signals

4. **Embedding Types:**
   - LLM: Pure language model embeddings for semantic understanding
   - LLM + GCL: Graph-enhanced embeddings that also consider user interaction patterns
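
The preference weighting in step 3 can be illustrated with a small sketch. It assumes a linear interpolation between the mean of the selected movies' embeddings and the text embedding, which matches the documented endpoints (α = 0 and α = 1) but is not confirmed as the app's exact formula. The names `build_query_embedding` and `MAX_SELECTED_MOVIES` are hypothetical, and whether vectors are normalized before blending is left open.

```python
MAX_SELECTED_MOVIES = 5  # "up to 5 movies" from step 1


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def build_query_embedding(movie_vectors, text_vector, alpha):
    """Blend viewing history and text description into one query vector.

    alpha = 0 uses only the movie history, alpha = 1 only the text.
    Assumed formula: (1 - alpha) * mean(history) + alpha * text.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if not movie_vectors:
        raise ValueError("select at least one movie")
    if len(movie_vectors) > MAX_SELECTED_MOVIES:
        raise ValueError(f"select at most {MAX_SELECTED_MOVIES} movies")
    history = mean_vector(movie_vectors)
    return [
        (1.0 - alpha) * h + alpha * t
        for h, t in zip(history, text_vector)
    ]
```

For example, with two selected movies embedded as `[1.0, 0.0]` and `[0.0, 1.0]` and a text embedding of `[0.0, 2.0]`, α = 0.5 gives `[0.25, 1.25]`: halfway between the history mean `[0.5, 0.5]` and the text vector.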

## Data Processing

For details on the dataset processing pipeline, see [DATA_PROCESSING.md](DATA_PROCESSING.md).

## Contributing

Feel free to open issues or submit pull requests with improvements!