| title: Mongodb Gemini Rag | |
| emoji: ♊️ | |
| colorFrom: indigo | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 4.31.5 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| # Atlas Vector Search Chat with MongoDB and Google Gemini | |
| Welcome to the Atlas Vector Search Chat! This application demonstrates how to use MongoDB [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) with [Google Gemini](https://ai.google.dev/) for semantic search and retrieval tasks. | |
| ## Features | |
| - **Interactive Chat**: Ask questions related to the embedded documents. | |
| - **Vector Search**: Utilizes MongoDB Atlas Vector Search to find relevant documents based on similarity. | |
| - **Google Gemini Integration**: Embeds text and generates responses. | |
| ## Requirements | |
| - Python 3.8 or later (Gradio 4.x does not support Python 3.7) | |
| - MongoDB Atlas account | |
| - Atlas cluster with network access allowed from `0.0.0.0/0`, and its connection string | |
| - Google Cloud account with access to Gemini | |
| ## Installation | |
| 1. **Clone the space**: | |
| - Click [...] and clone the space to your repo. Make sure to set the variables listed in the next step. | |
| 2. **Set up environment variables**: | |
| - `GOOGLE_API_KEY`: Your Google API key for Gemini. | |
| - `MONGODB_ATLAS_URI`: Your MongoDB Atlas connection string. | |
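This README does not show how `app.py` reads these variables. A minimal sketch, assuming they are read from the process environment, could fail early when one is missing. The helper name `load_settings` is hypothetical and not taken from `app.py`.

```python
import os


def load_settings(env=None):
    """Return the two settings listed above, raising if either is missing.

    `load_settings` is a hypothetical helper, not a function from app.py.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    required = ("GOOGLE_API_KEY", "MONGODB_ATLAS_URI")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Calling `load_settings()` at startup surfaces a misconfigured Space before the first chat request rather than during it.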
| ## Running the Application | |
| 1. **Start the application**: | |
| ```bash | |
| python app.py | |
| ``` | |
| 2. **Access the interface**: | |
| Open the `App` tab in your browser. | |
| ## Vector Search Index Configuration | |
| To create a [vector search index](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-index/) on the `google-ai.embedded_docs` collection, use the following configuration: | |
| ```json | |
| { | |
| "fields": [ | |
| { | |
| "numDimensions": 768, | |
| "path": "embedding", | |
| "similarity": "cosine", | |
| "type": "vector" | |
| } | |
| ] | |
| } | |
| ``` | |
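Once the index exists, a query against it can be expressed as a `$vectorSearch` aggregation stage. The sketch below only builds the pipeline as plain Python data. The index name `vector_index`, the `numCandidates` multiplier, and the default `limit` are assumptions, not values taken from `app.py`; the `embedding` path and the 768 dimensions come from the index configuration above.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index", limit=5):
    """Build an aggregation pipeline for Atlas Vector Search.

    Hypothetical helper: `index_name` must match whatever name the index
    was given when it was created.
    """
    # The index above declares numDimensions: 768, so reject other sizes early.
    if len(query_vector) != 768:
        raise ValueError(f"expected 768 dimensions, got {len(query_vector)}")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",            # field holding the stored vectors
                "queryVector": list(query_vector),
                "numCandidates": limit * 20,    # assumed oversampling factor
                "limit": limit,
            }
        },
        {
            # Return the text plus the similarity score computed by the search.
            "$project": {
                "_id": 0,
                "content": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the returned list could be passed to `collection.aggregate(...)` on the `google-ai.embedded_docs` collection, using a query vector produced by the same embedding model as the stored documents.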
| ## MongoDB Trigger to Embed Results | |
| This Atlas environment uses an Atlas Database Trigger on the `google-ai.embedded_docs` collection to capture every `insert` operation and embed the document's content, as described in this [article](https://www.mongodb.com/developer/products/atlas/semantic-search-mongodb-atlas-vector-search/). | |
| ```javascript | |
| // Atlas Functions export a function; the trigger passes in the change event | |
| exports = async function(changeEvent) { | |
|   // Get the API key from Realm's Values & Secrets | |
|   const apiKey = context.values.get('google-api-key'); | |
|   // Set up the URL for the Google Generative Language API - embedding endpoint | |
|   const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:embedContent?key=${apiKey}`; | |
|   // batch example | |
|   // const url = `https://generativelanguage.googleapis.com/v1beta/models/embedding-001:batchEmbedContents?key=${apiKey}`; | |
|   // Get the full document from the change event | |
|   const doc = changeEvent.fullDocument; | |
|   try { | |
|     console.log(`Processing document with id: ${doc._id}`); | |
|     // Prepare the request body; JSON.stringify escapes quotes and newlines in the content | |
|     const requestBody = JSON.stringify({ | |
|       model: "models/embedding-001", | |
|       content: { parts: [{ text: doc.content }] } | |
|     }); | |
|     // Make the HTTP POST request | |
|     const response = await context.http.post({ | |
|       url: url, | |
|       headers: { 'Content-Type': ['application/json'] }, | |
|       body: requestBody | |
|     }); | |
|     // Parse the JSON response | |
|     const responseData = EJSON.parse(response.body.text()); | |
|     console.log(JSON.stringify(responseData)); | |
|     if (response.statusCode === 200) { | |
|       console.log("Successfully received embedding response from the API."); | |
|       // Extract the embedding from the response | |
|       const embedding = responseData.embedding.values; // Adjust based on actual response structure | |
|       // Use the name of your linked MongoDB Atlas cluster (data source) | |
|       const collection = context.services.get("mongodb-atlas").db("google-ai").collection("embedded_docs"); | |
|       // Update the document in MongoDB with the embedding | |
|       const updateResult = await collection.updateOne( | |
|         { _id: doc._id }, | |
|         { $set: { embedding: embedding }} | |
|       ); | |
|       if (updateResult.modifiedCount === 1) { | |
|         console.log("Successfully updated the document."); | |
|       } else { | |
|         console.log("Failed to update the document."); | |
|       } | |
|     } else { | |
|       console.log(`Failed to receive embedding. Status code: ${response.statusCode} - ${JSON.stringify(response)}`); | |
|     } | |
|   } catch(err) { | |
|     console.error(`Error making request to API: ${err}`); | |
|   } | |
| }; | |
| ``` | |
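The same request can be prepared in Python when checking the payload outside Atlas. This is a minimal sketch that makes no network call: it builds the `embedContent` body with `json.dumps`, which escapes quotes and newlines in the document text, and reads `embedding.values` from a parsed response in the same way the trigger does. The function names are hypothetical.

```python
import json

# Model name as used by the trigger above.
MODEL = "models/embedding-001"


def build_embed_request(text):
    """Serialize the embedContent request body for one piece of text."""
    payload = {"model": MODEL, "content": {"parts": [{"text": text}]}}
    return json.dumps(payload)


def extract_embedding(response_body):
    """Pull the vector out of a JSON response shaped like {"embedding": {"values": [...]}}."""
    data = json.loads(response_body)
    return data["embedding"]["values"]
```

Serializing through `json.dumps` rather than string formatting keeps the body valid JSON even when the text contains quotes or line breaks.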