Spaces: Kshitij2604/Movie_Rec

Kshitij2604 committed on Mar 24, 2025

Commit 000c556 (verified) · 1 Parent(s): 6610185

Upload 11 files

Files changed (12)

.gitattributes +1 -0
.gitignore +44 -0
README.md +58 -12
Untitled.ipynb +0 -0
app.py +467 -0
movie_dict.pkl +3 -0
movies.pkl +3 -0
packages.txt +1 -0
requirements.txt +7 -0
similarity.pkl +3 -0
tmdb_5000_credits.xls +3 -0
tmdb_5000_movies.xls +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tmdb_5000_credits.xls filter=lfs diff=lfs merge=lfs -text

.gitignore ADDED

@@ -0,0 +1,44 @@

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+# Streamlit
+.streamlit/
+# Virtual Environment
+venv/
+ENV/
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+# OS specific
+.DS_Store
+Thumbs.db
+# Large data files
+# Uncomment if you want to exclude large data files from git
+# *.pkl
+# *.csv
+# *.h5

README.md CHANGED

@@ -1,12 +1,58 @@
----
-title: Movie Rec
-emoji: 🏆
-colorFrom: yellow
-colorTo: green
-sdk: streamlit
-sdk_version: 1.43.2
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Movie Recommender System
+A Streamlit-based movie recommendation application that uses a hybrid approach combining content-based filtering and collaborative filtering.
+## Features
+- Content-based movie recommendations
+- Collaborative filtering based on user preferences
+- Beautiful, responsive UI
+- Movie details including posters, ratings, genres, and overviews
+- Wishlist functionality
+- Search history tracking
+## Deployment on Hugging Face Spaces
+This application is designed to be deployed on Hugging Face Spaces.
+### Setup Instructions
+1. Create a new Space on Hugging Face
+   - Select **Streamlit** as the SDK
+   - Set the Python version to 3.9+
+2. Upload the following files to your Space:
+   - `app.py`: Main application code
+   - `requirements.txt`: Dependencies
+   - `movie_dict.pkl`: Movie data dictionary
+   - `similarity.pkl`: Similarity matrix
+3. Configure the Space:
+   - Set a secret environment variable `TMDB_API_KEY` with your TMDB API key
+   - Allocate sufficient RAM (at least 4 GB recommended because of the size of the similarity matrix)
+4. Build the Space and wait for the deployment to complete
+## Local Development
+To run the application locally:
+```bash
+pip install -r requirements.txt
+streamlit run app.py
+```
+## Data Sources
+The application uses The Movie Database (TMDB) API for fetching movie details and posters.
+## Implementation Details
+- **Data Preprocessing**: The movie data is preprocessed and similarity scores are calculated based on movie features (genres, keywords, cast, crew, etc.)
+- **Recommendation Algorithm**: Uses cosine similarity for content-based filtering and combines it with collaborative filtering based on the user's wishlist
+- **User Interface**: Built with Streamlit and custom CSS for a modern, responsive design
+- **Data Structures**: Uses a linked list for search history and a deque for wishlist management
+## License
+This project is open source and available under the MIT License.
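
The hybrid scoring described under Implementation Details can be sketched as follows. This is a minimal, dependency-free illustration, not the committed code: the toy feature vectors and the helper names (`cosine`, `similarity_matrix`, `hybrid_scores`, `top_k`) are assumptions, and only the 0.7/0.3 weighting is taken from `app.py`. The second term averages similarity to the user's own wishlist, so it behaves more like content-based personalization than collaborative filtering across users.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_matrix(features):
    """Pairwise cosine similarity between all feature vectors."""
    return [[cosine(u, v) for v in features] for u in features]

def hybrid_scores(sim, query, wishlist, w_content=0.7, w_wishlist=0.3):
    """Blend similarity to the query with mean similarity to the wishlist items."""
    content = sim[query]
    if not wishlist:
        return list(content)
    wish_avg = [sum(sim[w][j] for w in wishlist) / len(wishlist)
                for j in range(len(content))]
    return [w_content * c + w_wishlist * a for c, a in zip(content, wish_avg)]

def top_k(scores, query, k):
    """Indices of the k best-scoring items, excluding the query itself."""
    candidates = (j for j in range(len(scores)) if j != query)
    return sorted(candidates, key=lambda j: scores[j], reverse=True)[:k]

# Hypothetical tag counts for four movies (rows) over four tags (columns).
features = [
    [1, 1, 0, 0],  # movie 0
    [1, 1, 1, 0],  # movie 1
    [0, 0, 1, 1],  # movie 2
    [0, 1, 1, 1],  # movie 3
]
sim = similarity_matrix(features)

content_only = top_k(hybrid_scores(sim, query=1, wishlist=[]), query=1, k=2)
with_wishlist = top_k(hybrid_scores(sim, query=1, wishlist=[2]), query=1, k=2)
print(content_only, with_wishlist)
```

With an empty wishlist the ranking for movie 1 follows content similarity alone; adding movie 2 to the wishlist pulls movies that resemble it up the list.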

Untitled.ipynb ADDED

The diff for this file is too large to render. See raw diff

app.py ADDED

@@ -0,0 +1,467 @@

+import streamlit as st
+import pickle
+import pandas as pd
+import numpy as np
+import requests
+from collections import deque
+import time
+import os
+from pathlib import Path
+# Set page configuration
+st.set_page_config(
+    page_title="Movie Recommender System",
+    page_icon="🎬",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+# Apply custom CSS
+st.markdown("""
+<style>
+    .main-header {
+        font-size: 36px;
+        font-weight: bold;
+        color: #FF4B4B;
+        text-align: center;
+        margin-bottom: 20px;
+        padding: 20px;
+        background-color: #1E1E1E;
+        border-radius: 10px;
+    }
+    .sub-header {
+        font-size: 24px;
+        font-weight: bold;
+        color: #4B4BFF;
+        margin-top: 30px;
+        margin-bottom: 10px;
+    }
+    .movie-card {
+        background-color: #2E2E2E;
+        border-radius: 10px;
+        padding: 15px;
+        margin-bottom: 15px;
+        box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
+    }
+    .rating-badge {
+        background-color: #FFD700;
+        color: #000;
+        padding: 5px 10px;
+        border-radius: 15px;
+        font-weight: bold;
+        display: inline-block;
+        margin-top: 5px;
+    }
+    .movie-title {
+        font-size: 18px;
+        font-weight: bold;
+        margin-bottom: 10px;
+        color: white;
+    }
+    .movie-info {
+        font-size: 14px;
+        margin-bottom: 5px;
+        color: #CCC;
+    }
+    .wishlist-btn {
+        background-color: #4CAF50;
+        color: white;
+        border: none;
+        padding: 8px 15px;
+        text-align: center;
+        text-decoration: none;
+        display: inline-block;
+        border-radius: 5px;
+        cursor: pointer;
+    }
+    .sidebar-content {
+        padding: 15px;
+        background-color: #262730;
+        border-radius: 10px;
+        margin-bottom: 20px;
+    }
+    .stApp {
+        max-width: 1200px;
+        margin: 0 auto;
+    }
+</style>
+""", unsafe_allow_html=True)
+# Get the directory where the script is located
+SCRIPT_DIR = Path(__file__).parent.absolute()
+# Define a Node for the Linked List
+class Node:
+    def __init__(self, data=None):
+        self.data = data
+        self.next = None
+# Define the Linked List class for the search history
+class LinkedList:
+    def __init__(self):
+        self.head = None
+    def append(self, data):
+        new_node = Node(data)
+        if self.head is None:
+            self.head = new_node
+        else:
+            current = self.head
+            while current.next:
+                current = current.next
+            current.next = new_node
+    def get_all(self):
+        history = []
+        current = self.head
+        while current:
+            history.append(current.data)
+            current = current.next
+        return history
+# Function to fetch movie details from TMDB
+def fetch_movie_details(movie_id):
+    # Read the API key from the environment; on Hugging Face, set TMDB_API_KEY as a Space secret.
+    # Do not commit a fallback key: without one, TMDB rejects the request and placeholders are shown.
+    api_key = os.environ.get('TMDB_API_KEY', '')
+    try:
+        response = requests.get(
+            f'https://api.themoviedb.org/3/movie/{movie_id}?api_key={api_key}&language=en-US',
+            timeout=10)  # fail fast instead of hanging the app if TMDB is unreachable
+        data = response.json()
+        poster_path = data.get('poster_path', '')
+        poster_url = "https://image.tmdb.org/t/p/w500/" + poster_path if poster_path else "https://via.placeholder.com/500x750?text=No+Image+Available"
+        return {
+            'poster_url': poster_url,
+            'overview': data.get('overview', 'No overview available'),
+            'release_date': data.get('release_date', 'Unknown'),
+            'vote_average': data.get('vote_average', 0),
+            'genres': [genre['name'] for genre in data.get('genres', [])]
+        }
+    except Exception as e:
+        st.error(f"Error fetching movie details: {e}")
+        return {
+            'poster_url': "https://via.placeholder.com/500x750?text=Error+Loading+Image",
+            'overview': 'Error loading movie details',
+            'release_date': 'Unknown',
+            'vote_average': 0,
+            'genres': []
+        }
+# Function to recommend movies based on hybrid approach
+def recommend(movie, num_recommendations=6):
+    try:
+        # Get movie index
+        movie_index = movies[movies['title'] == movie].index[0]
+        # Get content-based similarity scores
+        content_distances = similarity[movie_index]
+        # Get collaborative filtering component (based on user preferences in wishlist if available)
+        if len(st.session_state.wishlist) > 0:
+            # Find similar movies to wishlist items
+            wishlist_indices = [movies[movies['title'] == wish_movie].index[0] for wish_movie in st.session_state.wishlist if wish_movie in movies['title'].values]
+            if wishlist_indices:
+                # Calculate average similarity to wishlist items
+                wishlist_similarity = np.mean([similarity[idx] for idx in wishlist_indices], axis=0)
+                # Combine content-based and collaborative filtering (weighted average)
+                combined_distances = 0.7 * content_distances + 0.3 * wishlist_similarity
+            else:
+                combined_distances = content_distances
+        else:
+            combined_distances = content_distances
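+        # Rank every other movie by combined score; exclude the selected movie explicitly,
+        # because with ties or wishlist blending it is not guaranteed to rank first
+        movie_indices = sorted(((idx, score) for idx, score in enumerate(combined_distances) if idx != movie_index), reverse=True, key=lambda x: x[1])[:num_recommendations]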
+        recommended_movies = []
+        for i in movie_indices:
+            movie_id = movies.iloc[i[0]].movie_id
+            movie_title = movies.iloc[i[0]].title
+            movie_details = fetch_movie_details(movie_id)
+            recommended_movies.append({
+                'title': movie_title,
+                'id': movie_id,
+                'poster': movie_details['poster_url'],
+                'overview': movie_details['overview'],
+                'release_date': movie_details['release_date'],
+                'rating': movie_details['vote_average'],
+                'genres': movie_details['genres'],
+                'similarity_score': round(i[1] * 100, 1)
+            })
+        return recommended_movies
+    except Exception as e:
+        st.error(f"Error in recommendation algorithm: {e}")
+        return []
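+# Load the movie data once per process; cache_resource returns the same objects on every rerun
+# instead of copying the ~185 MB similarity matrix as cache_data would
+@st.cache_resource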
+def load_data():
+    try:
+        # Construct paths dynamically to work in different environments
+        movie_dict_path = os.path.join(SCRIPT_DIR, 'movie_dict.pkl')
+        similarity_path = os.path.join(SCRIPT_DIR, 'similarity.pkl')
+        # Check if files exist
+        if not os.path.exists(movie_dict_path):
+            st.error(f"File not found: {movie_dict_path}")
+            st.stop()
+        if not os.path.exists(similarity_path):
+            st.error(f"File not found: {similarity_path}")
+            st.stop()
+        # Load the data
+        with open(movie_dict_path, 'rb') as f:
+            movies_dict = pickle.load(f)
+        movies_df = pd.DataFrame(movies_dict)
+        with open(similarity_path, 'rb') as f:
+            similarity_matrix = pickle.load(f)
+        return movies_df, similarity_matrix
+    except Exception as e:
+        st.error(f"Error loading data: {e}")
+        st.stop()
+# Load data
+try:
+    movies, similarity = load_data()
+except Exception as e:
+    st.error(f"Error loading data: {e}")
+    st.stop()
+# Initialize the wishlist queue
+if 'wishlist' not in st.session_state:
+    st.session_state.wishlist = deque(maxlen=10)  # Limit to 10 movies
+# Initialize the search history linked list
+if 'search_history' not in st.session_state:
+    st.session_state.search_history = LinkedList()
+# Initialize other session states
+if 'show_recommendations' not in st.session_state:
+    st.session_state.show_recommendations = False
+if 'current_recommendations' not in st.session_state:
+    st.session_state.current_recommendations = []
+if 'tab' not in st.session_state:
+    st.session_state.tab = "recommend"
+# Create sidebar for user options
+with st.sidebar:
+    st.markdown('<div class="sidebar-content">', unsafe_allow_html=True)
+    # Use a direct URL for the image in Hugging Face
+    st.image("https://img.icons8.com/color/96/000000/film-reel.png", width=80)
+    st.markdown("## Movie Explorer")
+    # Navigation
+    selected_tab = st.radio("Navigation", ["Recommendations", "Wishlist", "History"])
+    if selected_tab == "Recommendations":
+        st.session_state.tab = "recommend"
+    elif selected_tab == "Wishlist":
+        st.session_state.tab = "wishlist"
+    else:
+        st.session_state.tab = "history"
+    st.markdown("## About")
+    st.info("This movie recommendation system uses a hybrid approach combining content-based filtering and collaborative filtering to provide personalized movie recommendations.")
+    # Add Hugging Face attribution
+    st.markdown("## Deployment")
+    st.success("Deployed on Hugging Face Spaces")
+    st.markdown("</div>", unsafe_allow_html=True)
+# Main content
+st.markdown('<h1 class="main-header">🎬 Movie Recommender System</h1>', unsafe_allow_html=True)
+# Recommendations Tab
+if st.session_state.tab == "recommend":
+    st.markdown('<h2 class="sub-header">Find Your Next Favorite Movie</h2>', unsafe_allow_html=True)
+    # Movie selection with autocomplete
+    col1, col2 = st.columns([3, 1])
+    with col1:
+        selected_movie_name = st.selectbox(
+            'Select a movie you like:',
+            movies['title'].values
+        )
+    with col2:
+        recommendation_button = st.button('Get Recommendations', type="primary")
+    # Display movie details for selected movie
+    if selected_movie_name:
+        movie_idx = movies[movies['title'] == selected_movie_name].index[0]
+        movie_id = movies.iloc[movie_idx].movie_id
+        movie_details = fetch_movie_details(movie_id)
+        col1, col2 = st.columns([1, 3])
+        with col1:
+            st.image(movie_details['poster_url'], width=200)
+        with col2:
+            st.markdown(f"### {selected_movie_name}")
+            st.markdown(f"**Released:** {movie_details['release_date']}")
+            st.markdown(f"**Rating:** {movie_details['vote_average']}/10")
+            st.markdown(f"**Genres:** {', '.join(movie_details['genres'])}")
+            st.markdown(f"**Overview:** {movie_details['overview']}")
+            # Add to wishlist button
+            if st.button('Add to Wishlist', key='add_wishlist'):
+                if selected_movie_name not in st.session_state.wishlist:
+                    st.session_state.wishlist.append(selected_movie_name)
+                    st.success(f'Added "{selected_movie_name}" to your wishlist!')
+                else:
+                    st.info(f'"{selected_movie_name}" is already in your wishlist!')
+    # Get and display recommendations
+    if recommendation_button:
+        with st.spinner('Finding the best movies for you...'):
+            # Simulate processing time for better UX
+            time.sleep(0.5)  # Reduced time for better performance on Hugging Face
+            # Add the movie to search history linked list
+            st.session_state.search_history.append(selected_movie_name)
+            # Get recommendations
+            st.session_state.current_recommendations = recommend(selected_movie_name)
+            st.session_state.show_recommendations = True
+    # Display recommendations
+    if st.session_state.show_recommendations:
+        st.markdown('<h2 class="sub-header">Recommended Movies</h2>', unsafe_allow_html=True)
+        if not st.session_state.current_recommendations:
+            st.warning("No recommendations found. Please try another movie.")
+        else:
+            # Display recommendations in a grid
+            cols = st.columns(3)  # 3 movies per row
+            for i, movie in enumerate(st.session_state.current_recommendations):
+                with cols[i % 3]:
+                    st.markdown(f"""
+                    <div class="movie-card">
+                        <div class="movie-title">{movie['title']}</div>
+                        <div class="rating-badge">⭐ {movie['rating']}/10</div>
+                        <div class="movie-info">Similarity: {movie['similarity_score']}%</div>
+                    </div>
+                    """, unsafe_allow_html=True)
+                    st.image(movie['poster'], width=200)
+                    with st.expander("Details"):
+                        st.write(f"**Release Date:** {movie['release_date']}")
+                        st.write(f"**Genres:** {', '.join(movie['genres'])}")
+                        st.write(f"**Overview:** {movie['overview']}")
+                    if st.button('Add to Wishlist', key=f'add_wish_{i}'):
+                        if movie['title'] not in st.session_state.wishlist:
+                            st.session_state.wishlist.append(movie['title'])
+                            st.success(f'Added "{movie["title"]}" to your wishlist!')
+                        else:
+                            st.info(f'"{movie["title"]}" is already in your wishlist!')
+# Wishlist Tab
+elif st.session_state.tab == "wishlist":
+    st.markdown('<h2 class="sub-header">Your Wishlist</h2>', unsafe_allow_html=True)
+    if len(st.session_state.wishlist) > 0:
+        # Display the wishlist with additional options
+        for i, movie in enumerate(list(st.session_state.wishlist)):
+            col1, col2, col3 = st.columns([1, 3, 1])
+            with col1:
+                try:
+                    movie_idx = movies[movies['title'] == movie].index[0]
+                    movie_id = movies.iloc[movie_idx].movie_id
+                    movie_details = fetch_movie_details(movie_id)
+                    st.image(movie_details['poster_url'], width=150)
+                except Exception:
+                    st.image("https://via.placeholder.com/150x225?text=No+Image", width=150)
+            with col2:
+                st.markdown(f"### {movie}")
+                try:
+                    movie_idx = movies[movies['title'] == movie].index[0]
+                    movie_id = movies.iloc[movie_idx].movie_id
+                    movie_details = fetch_movie_details(movie_id)
+                    st.markdown(f"**Released:** {movie_details['release_date']}")
+                    st.markdown(f"**Rating:** {movie_details['vote_average']}/10")
+                    st.markdown(f"**Genres:** {', '.join(movie_details['genres'])}")
+                    with st.expander("Overview"):
+                        st.write(movie_details['overview'])
+                except Exception:
+                    st.write("Details not available")
+            with col3:
+                if st.button("Remove", key=f"remove_{i}"):
+                    st.session_state.wishlist.remove(movie)
+                    st.rerun()
+                if st.button("Find Similar", key=f"similar_{i}"):
+                    st.session_state.tab = "recommend"
+                    with st.spinner('Finding similar movies...'):
+                        st.session_state.current_recommendations = recommend(movie)
+                        st.session_state.show_recommendations = True
+                    st.rerun()
+            st.markdown("---")
+        # Clear wishlist button
+        if st.button("Clear Wishlist"):
+            st.session_state.wishlist.clear()
+            st.success("Wishlist cleared!")
+            st.rerun()
+    else:
+        st.info("Your wishlist is empty. Add movies to your wishlist by clicking 'Add to Wishlist' on movie cards.")
+# History Tab
+else:
+    st.markdown('<h2 class="sub-header">Your Search History</h2>', unsafe_allow_html=True)
+    search_history_list = st.session_state.search_history.get_all()
+    if search_history_list:
+        # Display search history
+        for i, movie in enumerate(search_history_list):
+            col1, col2 = st.columns([4, 1])
+            with col1:
+                st.markdown(f"### {i+1}. {movie}")
+            with col2:
+                if st.button("Find Again", key=f"find_again_{i}"):
+                    st.session_state.tab = "recommend"
+                    with st.spinner('Getting recommendations...'):
+                        st.session_state.current_recommendations = recommend(movie)
+                        st.session_state.show_recommendations = True
+                    st.rerun()
+            st.markdown("---")
+    else:
+        st.info("No search history available. Start searching for movie recommendations to build your history.")
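
The two containers behind the History and Wishlist tabs can be exercised outside Streamlit. The sketch below is illustrative rather than the committed code: it keeps a `tail` pointer so that `append` runs in O(1) instead of walking the whole list as `LinkedList.append` in `app.py` does, and it shows that `deque(maxlen=...)` silently drops the oldest entry once it is full. The class name `TailLinkedList` and the sample titles are assumptions.

```python
from collections import deque

class Node:
    def __init__(self, data=None):
        self.data = data
        self.next = None

class TailLinkedList:
    """Singly linked list that tracks its tail so append is O(1)."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, data):
        node = Node(data)
        if self.head is None:
            self.head = node       # first element is both head and tail
        else:
            self.tail.next = node  # link after the current tail
        self.tail = node

    def get_all(self):
        items, current = [], self.head
        while current:
            items.append(current.data)
            current = current.next
        return items

# Search history keeps every search, in order, including repeats.
history = TailLinkedList()
for title in ["Avatar", "Spectre", "Avatar"]:
    history.append(title)

# A bounded wishlist; app.py uses maxlen=10, 3 keeps the example short.
wishlist = deque(maxlen=3)
for title in ["A", "B", "C", "D"]:
    wishlist.append(title)     # appending "D" evicts "A"

print(history.get_all(), list(wishlist))
```

Because eviction is silent, a user who adds an eleventh movie in the app loses the oldest wishlist entry without any message.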

movie_dict.pkl ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3edb834fa65181717a94afccfcc6f05667e3aea8dc52d697cd49e7085721848b
+size 2156446

movies.pkl ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bdd31f65dad6f5370ba7408bb44867714575a314d1bbcd0e9729ef842f30e0ee
+size 2175040

packages.txt ADDED

@@ -0,0 +1 @@

+build-essential

requirements.txt ADDED

@@ -0,0 +1,7 @@

+streamlit==1.32.0
+pandas==2.0.3
+numpy==1.24.3
+requests==2.31.0
+scikit-learn==1.2.2
+pillow==10.1.0
+protobuf==4.23.4

similarity.pkl ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9e0798743053348f1075cb166e92bca6568ba504bc2fff99c58aa1feb5e54719
+size 184781251
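
The LFS pointer size gives a rough check on the README's RAM guidance. Assuming `similarity.pkl` holds a single dense, square float64 array (an assumption; the pickle's contents are not shown in this diff), the number of rows can be estimated from the byte size alone:

```python
import math

size_bytes = 184_781_251   # from the LFS pointer above
bytes_per_value = 8        # float64, assumed

# Largest n such that an n x n float64 matrix fits in the file.
n = math.isqrt(size_bytes // bytes_per_value)
payload = bytes_per_value * n * n
overhead = size_bytes - payload  # bytes left over, e.g. for pickle framing

print(n, payload, overhead)
```

Under that assumption the matrix is about 4,806 × 4,806 and takes roughly 185 MB once loaded, before counting any copies made by caching or by averaging wishlist rows.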

tmdb_5000_credits.xls ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9d0050599ff88d40366c4841204b1489862bca346bfa46c20b05a65d14508435
+size 40044293

tmdb_5000_movies.xls ADDED

The diff for this file is too large to render. See raw diff