Space: ArthurSrz/borges-graph

ArthurSrz committed on Oct 19, 2025

Commit 8a86da4 (1 parent: 0a69846)

Add Borges Graph app with GraphRAG data

Files changed (8)

README.md +61 -5
a_rebours_huysmans/graph_chunk_entity_relation.graphml +0 -0
a_rebours_huysmans/kv_store_community_reports.json +0 -0
a_rebours_huysmans/kv_store_full_docs.json +0 -0
a_rebours_huysmans/kv_store_llm_response_cache.json +0 -0
a_rebours_huysmans/kv_store_text_chunks.json +0 -0
app.py +347 -0
requirements.txt +7 -0

README.md CHANGED

@@ -1,13 +1,69 @@
 ---
 title: Borges Graph
-emoji: 🌍
+emoji: 📚
 colorFrom: yellow
-colorTo: green
+colorTo: orange
 sdk: gradio
-sdk_version: 5.49.1
+sdk_version: 4.44.0
 app_file: app.py
 pinned: false
-short_description: Hosting of Borges graphRAG mechanism
+license: mit
+short_description: GraphRAG Explorer for Borgesian Literature Analysis
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Borges Graph - GraphRAG Explorer
+An intelligent interface for exploring literature with GraphRAG. Built on nano-graphrag, this application lets you ask natural-language questions about literary works and visualizes the search process through the knowledge graph.
+## 🌟 Features
+- **Semantic search**: Ask your questions in French
+- **GraphRAG analysis**: Uses nano-graphrag to explore connections
+- **Gradio interface**: Intuitive web interface
+- **Built-in API**: Endpoint for external integrations
+- **Demo mode**: Works even without GraphRAG data
+## 🚀 Usage
+### Web interface
+1. Type your question in the search field
+2. Choose the mode (Local or Global)
+3. Click "Explorer le graphe" (Explore the graph)
+4. Read the answer and the analysis of the graph traversal
+### API
+The application automatically exposes a Gradio API, accessible via:
+```
+POST /api/predict
+```
+## 📖 Example questions
+- "Quels sont les thèmes principaux de cette œuvre ?" (What are the main themes of this work?)
+- "Parle-moi des personnages" (Tell me about the characters)
+- "Comment les concepts sont-ils interconnectés ?" (How are the concepts interconnected?)
+- "Quelle est la structure narrative ?" (What is the narrative structure?)
+## 🛠 Architecture
+- **nano-graphrag**: GraphRAG search engine
+- **Gradio**: User interface and API
+- **OpenAI**: Language models for the analysis
+- **NetworkX**: Knowledge graph management
+## 📊 Data
+This application can work with pre-generated GraphRAG data. The data files must be organized in folders that each contain `graph_chunk_entity_relation.graphml`.
+## 🎯 Integration
+This API can be integrated into other applications, including:
+- Vercel/Next.js web applications
+- Graph visualization interfaces
+- Literary analysis tools
+## 🔗 Links
+- [nano-graphrag](https://github.com/gusye1234/nano-graphrag)
+- [Gradio](https://gradio.app)
+- [Hugging Face Spaces](https://huggingface.co/spaces)
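
The folder layout described in the README's Data section can be checked with a small sketch like the one below. It is not part of the commit: the helper name `find_book_folders` is hypothetical, and the only detail taken from the source is the file name `graph_chunk_entity_relation.graphml`. Results are sorted so that the first entry is stable across runs.

```python
from pathlib import Path
from typing import List

# File name taken from the README; everything else here is illustrative.
GRAPH_FILE = "graph_chunk_entity_relation.graphml"


def find_book_folders(root: Path) -> List[str]:
    """Return the names of non-hidden subfolders of `root` that contain the graph file."""
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir()
        and not child.name.startswith(".")
        and (child / GRAPH_FILE).is_file()
    )
```

Calling `find_book_folders(Path("."))` from the Space's root should list `a_rebours_huysmans` for this commit, assuming the working directory is the repository root.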

a_rebours_huysmans/graph_chunk_entity_relation.graphml ADDED

The diff for this file is too large to render.

a_rebours_huysmans/kv_store_community_reports.json ADDED

The diff for this file is too large to render.

a_rebours_huysmans/kv_store_full_docs.json ADDED

The diff for this file is too large to render.

a_rebours_huysmans/kv_store_llm_response_cache.json ADDED

The diff for this file is too large to render.

a_rebours_huysmans/kv_store_text_chunks.json ADDED

The diff for this file is too large to render.
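
Since the data diffs are not rendered, one way to sanity-check the added graph file locally is to count its nodes and edges. The sketch below uses only the standard library and assumes the file follows the standard GraphML schema and namespace (as NetworkX-written files do). That assumption is not confirmed by the commit, and `count_nodes_and_edges` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Tuple, Union

# Standard GraphML namespace; assumed, not verified against the committed file.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def count_nodes_and_edges(path: Union[str, Path]) -> Tuple[int, int]:
    """Count <node> and <edge> elements anywhere in a GraphML file."""
    root = ET.parse(str(path)).getroot()
    nodes = sum(1 for _ in root.iter(GRAPHML_NS + "node"))
    edges = sum(1 for _ in root.iter(GRAPHML_NS + "edge"))
    return nodes, edges
```

For example, `count_nodes_and_edges("a_rebours_huysmans/graph_chunk_entity_relation.graphml")` would report the size of the knowledge graph. A result of `(0, 0)` would suggest the namespace assumption does not hold for that file.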

app.py ADDED

@@ -0,0 +1,347 @@

+import gradio as gr
+import json
+import re
+import os
+import asyncio
+from pathlib import Path
+from typing import Dict, Any, List
+import tempfile
+import shutil
+# Try to import nano_graphrag, with fallback for demo
+try:
+    from nano_graphrag import GraphRAG, QueryParam
+    from nano_graphrag._llm import gpt_4o_mini_complete
+    NANO_GRAPHRAG_AVAILABLE = True
+except ImportError:
+    NANO_GRAPHRAG_AVAILABLE = False
+    print("⚠️ nano-graphrag not available, running in demo mode")
+class BorgesGraphRAG:
+    def __init__(self):
+        self.instances = {}
+        self.current_book = None
+    def load_book_data(self, book_folder: str):
+        """Load GraphRAG data for a specific book"""
+        if not NANO_GRAPHRAG_AVAILABLE:
+            return False
+        try:
+            if book_folder not in self.instances:
+                self.instances[book_folder] = GraphRAG(
+                    working_dir=book_folder,
+                    best_model_func=gpt_4o_mini_complete,
+                    cheap_model_func=gpt_4o_mini_complete,
+                    best_model_max_async=3,
+                    cheap_model_max_async=3
+                )
+            self.current_book = book_folder
+            return True
+        except Exception as e:
+            print(f"Error loading book data: {e}")
+            return False
+    def parse_context_csv(self, context_str: str):
+        """Parse the CSV context returned by GraphRAG"""
+        entities = []
+        relations = []
+        # Parse entities section
+        entities_match = re.search(r'-----Entities-----\n```csv\n(.*?)\n```', context_str, re.DOTALL)
+        if entities_match:
+            lines = entities_match.group(1).strip().split('\n')
+            for line in lines[1:]:  # Skip header
+                parts = line.split(',')
+                if len(parts) >= 5:
+                    entities.append({
+                        'id': parts[1].strip(),
+                        'type': parts[2].strip(),
+                        'description': ','.join(parts[3:-1]).strip(),
+                        'rank': float(parts[-1]) if parts[-1].strip() else 0
+                    })
+        # Parse relationships section
+        relations_match = re.search(r'-----Relationships-----\n```csv\n(.*?)\n```', context_str, re.DOTALL)
+        if relations_match:
+            lines = relations_match.group(1).strip().split('\n')
+            for line in lines[1:]:  # Skip header
+                parts = line.split(',')
+                if len(parts) >= 6:
+                    relations.append({
+                        'source': parts[1].strip(),
+                        'target': parts[2].strip(),
+                        'description': ','.join(parts[3:-2]).strip(),
+                        'weight': float(parts[-2]) if parts[-2].strip() else 1,
+                        'rank': float(parts[-1]) if parts[-1].strip() else 0
+                    })
+        return entities, relations
+    async def query_book(self, query: str, mode: str = "local") -> Dict[str, Any]:
+        """Query the current book with GraphRAG"""
+        if not NANO_GRAPHRAG_AVAILABLE or not self.current_book:
+            return self.get_demo_response(query)
+        try:
+            graph_instance = self.instances[self.current_book]
+            # Get context with details
+            context_param = QueryParam(mode=mode, only_need_context=True, top_k=20)
+            context = await graph_instance.aquery(query, param=context_param)
+            # Get actual answer
+            answer_param = QueryParam(mode=mode, top_k=20)
+            answer = await graph_instance.aquery(query, param=answer_param)
+            # Parse context
+            entities, relations = self.parse_context_csv(context)
+            return {
+                "success": True,
+                "answer": answer,
+                "searchPath": {
+                    # "order" and "score" are synthetic: they reflect list position, not a model score
+                    "entities": [
+                        {**e, "order": i+1, "score": 1.0 - (i * 0.05)}
+                        for i, e in enumerate(entities[:15])
+                    ],
+                    "relations": [
+                        {**r, "traversalOrder": i+1}
+                        for i, r in enumerate(relations[:20])
+                    ],
+                    # Placeholder entry: community data is not extracted from the context here
+                    "communities": [
+                        {"id": "community_1", "content": "Cluster thématique principal", "relevance": 0.9}
+                    ]
+                },
+                "book_id": self.current_book,
+                "mode": mode,
+                "query": query
+            }
+        except Exception as e:
+            return {
+                "success": False,
+                "error": str(e),
+                "fallback": self.get_demo_response(query)
+            }
+    def get_demo_response(self, query: str) -> Dict[str, Any]:
+        """Demo response when GraphRAG is not available"""
+        query_lower = query.lower()
+        if "thème" in query_lower or "theme" in query_lower:
+            answer = """Les thèmes principaux de cette œuvre borgésienne incluent:
+**1. Le Labyrinthe de la Connaissance**
+La bibliothèque infinie comme métaphore de l'univers et de la quête du savoir.
+**2. L'Identité et le Double**
+L'exploration de la nature fragmentée de l'identité humaine.
+**3. Le Temps Cyclique**
+La répétition éternelle et la nature circulaire de l'existence.
+**4. La Réalité et la Fiction**
+Les frontières floues entre le réel et l'imaginaire."""
+            entities = ["LABYRINTHE", "BIBLIOTHÈQUE", "IDENTITÉ", "TEMPS", "RÉALITÉ"]
+            relations = [
+                {"source": "LABYRINTHE", "target": "BIBLIOTHÈQUE"},
+                {"source": "IDENTITÉ", "target": "RÉALITÉ"},
+                {"source": "TEMPS", "target": "LABYRINTHE"}
+            ]
+        else:
+            answer = f"Analyse de votre question: '{query}'\n\nD'après l'exploration du graphe de connaissances, cette question touche aux concepts fondamentaux de l'univers borgésien. Les connexions révèlent une architecture narrative complexe où chaque élément participe d'un réseau de significations multiples."
+            entities = ["QUESTION", "CONNAISSANCE", "RÉSEAU", "ANALYSE"]
+            relations = [{"source": "QUESTION", "target": "CONNAISSANCE"}]
+        return {
+            "success": True,
+            "answer": answer,
+            "searchPath": {
+                "entities": [
+                    {
+                        "id": entity,
+                        "type": "CONCEPT",
+                        "description": f"{entity} - Concept clé de l'œuvre",
+                        "rank": 1,
+                        "order": i+1,
+                        "score": 0.9 - (i * 0.1)
+                    }
+                    for i, entity in enumerate(entities)
+                ],
+                "relations": [
+                    {
+                        **rel,
+                        "description": f"Relation entre {rel['source']} et {rel['target']}",
+                        "weight": 1,
+                        "rank": 1,
+                        "traversalOrder": i+1
+                    }
+                    for i, rel in enumerate(relations)
+                ],
+                "communities": [
+                    {"id": "demo_community", "content": "Cluster thématique (mode démo)", "relevance": 0.8}
+                ]
+            },
+            "book_id": "demo_book",
+            "mode": "demo",
+            "query": query
+        }
+# Initialize GraphRAG instance
+borges_rag = BorgesGraphRAG()
+# Check for available book data
+available_books = []
+for item in sorted(os.listdir('.')):  # sorted so the default book choice is deterministic
+    if os.path.isdir(item) and not item.startswith('.'):
+        graph_file = os.path.join(item, 'graph_chunk_entity_relation.graphml')
+        if os.path.exists(graph_file):
+            available_books.append(item)
+if available_books:
+    default_book = available_books[0]
+    borges_rag.load_book_data(default_book)
+    book_status = f"✅ Livre chargé: {default_book}"
+else:
+    book_status = "⚠️ Mode démo - Aucune donnée GraphRAG trouvée"
+async def process_query(query: str, mode: str) -> tuple:
+    """Process a query and return formatted results"""
+    if not query.strip():
+        return "❌ Veuillez entrer une question", "{}", ""
+    try:
+        result = await borges_rag.query_book(query, mode.lower())
+        if result.get("success"):
+            # Format the answer
+            answer = result["answer"]
+            # Format search path info
+            search_info = result["searchPath"]
+            entities_count = len(search_info["entities"])
+            relations_count = len(search_info["relations"])
+            # Create summary
+            summary = f"""
+📊 **Analyse de la traversée du graphe:**
+• {entities_count} entités identifiées
+• {relations_count} relations explorées
+• Mode: {result.get('mode', 'demo')}
+• Livre: {result.get('book_id', 'demo')}
+"""
+            # JSON for API
+            json_result = json.dumps(result, indent=2, ensure_ascii=False)
+            return answer, json_result, summary
+        else:
+            error_msg = result.get("error", "Erreur inconnue")
+            return f"❌ Erreur: {error_msg}", "{}", ""
+    except Exception as e:
+        return f"❌ Exception: {str(e)}", "{}", ""
+# Gradio interface
+def query_interface(query: str, mode: str):
+    """Sync wrapper for async query processing"""
+    # asyncio.run creates a fresh event loop, runs the coroutine, and closes the loop
+    return asyncio.run(process_query(query, mode))
+# Helper for external callers; note that no Gradio event in this file is bound to it
+def api_query(query: str, mode: str = "local", book_id: str = None):
+    """Return the raw, JSON-serializable result for a query.
+    `book_id` is accepted but currently unused: the query runs against the current book.
+    """
+    return asyncio.run(borges_rag.query_book(query, mode))
+# Gradio app
+with gr.Blocks(
+    title="Borges Graph - GraphRAG Explorer",
+    theme=gr.themes.Soft(primary_hue="amber"),
+    css="""
+    .gradio-container {
+        font-family: 'Georgia', serif;
+        background: linear-gradient(135deg, #1a1a1a 0%, #2d2d2d 100%);
+        color: #d4af37;
+    }
+    .gr-button-primary {
+        background: linear-gradient(135deg, #d4af37 0%, #b8941f 100%);
+        border: none;
+    }
+    """
+) as app:
+    gr.Markdown("""
+    # 📚 Borges Graph - GraphRAG Explorer
+    Explorez la bibliothèque infinie avec l'intelligence artificielle. Posez vos questions en langage naturel et découvrez les connexions secrètes dans l'univers borgésien.
+    """)
+    gr.Markdown(f"**Statut:** {book_status}")
+    with gr.Row():
+        with gr.Column(scale=2):
+            query_input = gr.Textbox(
+                label="🔍 Votre question",
+                placeholder="Quels sont les thèmes principaux de cette œuvre ?",
+                lines=2
+            )
+            mode_select = gr.Radio(
+                choices=["Local", "Global"],
+                value="Local",
+                label="Mode de recherche",
+                info="Local: recherche focalisée | Global: vue d'ensemble"
+            )
+            search_btn = gr.Button("🚀 Explorer le graphe", variant="primary")
+        with gr.Column(scale=1):
+            gr.Markdown("""
+            ### 💡 Questions suggérées:
+            - Quels sont les thèmes principaux ?
+            - Parle-moi des personnages
+            - Quelle est la structure narrative ?
+            - Comment les concepts sont-ils liés ?
+            """)
+    with gr.Row():
+        with gr.Column():
+            answer_output = gr.Markdown(label="📖 Réponse")
+            summary_output = gr.Markdown(label="📊 Résumé de l'analyse")
+    with gr.Accordion("🔧 Réponse JSON (pour développeurs)", open=False):
+        json_output = gr.Code(language="json", label="JSON Response")
+    # Event handlers
+    search_btn.click(
+        fn=query_interface,
+        inputs=[query_input, mode_select],
+        outputs=[answer_output, json_output, summary_output]
+    )
+    query_input.submit(
+        fn=query_interface,
+        inputs=[query_input, mode_select],
+        outputs=[answer_output, json_output, summary_output]
+    )
+# Launch the app
+if __name__ == "__main__":
+    app.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        share=False
+    )
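
The row handling in `parse_context_csv` relies on one idea: the first columns and the trailing numeric columns have fixed positions, so everything in between is rejoined as the description, which lets a description contain commas. The standalone sketch below isolates that idea so it can be tested on its own. The function names and the sample rows in the usage note are illustrative, not part of the commit. Unlike the original, it raises on short rows instead of skipping them.

```python
from typing import Dict, Union

Row = Dict[str, Union[str, float]]


def _to_float(text: str, default: float) -> float:
    """Convert a numeric column, falling back to `default` when it is empty."""
    text = text.strip()
    return float(text) if text else default


def parse_entity_row(line: str) -> Row:
    """Parse 'index,name,type,description...,rank'; the description may contain commas."""
    parts = line.split(",")
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(parts)}")
    return {
        "id": parts[1].strip(),
        "type": parts[2].strip(),
        # Everything between the type and the final rank column is the description
        "description": ",".join(parts[3:-1]).strip(),
        "rank": _to_float(parts[-1], 0.0),
    }


def parse_relation_row(line: str) -> Row:
    """Parse 'index,source,target,description...,weight,rank'."""
    parts = line.split(",")
    if len(parts) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(parts)}")
    return {
        "source": parts[1].strip(),
        "target": parts[2].strip(),
        # Everything between the target and the last two numeric columns
        "description": ",".join(parts[3:-2]).strip(),
        "weight": _to_float(parts[-2], 1.0),
        "rank": _to_float(parts[-1], 0.0),
    }
```

With a made-up row such as `0,DES ESSEINTES,PERSON,Reclusive aesthete, last of his line,12`, the description keeps its internal comma. The approach still breaks if a name or type column contains a comma, and it does not strip surrounding quotes, so it only fits context output whose first and last columns are comma-free.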

requirements.txt ADDED

@@ -0,0 +1,7 @@

+gradio>=4.0.0
+nano-graphrag
+openai>=1.0.0
+networkx>=3.0
+numpy>=1.21.0
+tiktoken>=0.4.0
+aiohttp>=3.8.0
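
These requirements use lower bounds only (and `nano-graphrag` is unpinned), while the README front matter sets `sdk_version: 4.44.0`. A rough way to compare an installed version against these bounds is sketched below. It is a simplification, not a full requirements parser: it handles only bare names and `>=` with plain numeric versions, and both function names are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]

# Matches "name" or "name>=1.2.3"; other specifiers are deliberately unsupported.
_REQ = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(?:>=\s*([0-9]+(?:\.[0-9]+)*))?$")


def parse_minimums(text: str) -> Dict[str, Optional[Version]]:
    """Map each package name to its '>=' lower bound, or None when unpinned."""
    minimums: Dict[str, Optional[Version]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        match = _REQ.match(line)
        if match is None:
            raise ValueError(f"unsupported requirement line: {line!r}")
        name, version = match.groups()
        minimums[name.lower()] = (
            tuple(int(part) for part in version.split(".")) if version else None
        )
    return minimums


def _pad(version: Version, width: int) -> Version:
    """Right-pad a version tuple with zeros so tuples of unequal length compare fairly."""
    return version + (0,) * (width - len(version))


def meets_minimum(installed: str, minimum: Optional[Version]) -> bool:
    """Return True when a plain numeric version string satisfies the lower bound."""
    if minimum is None:
        return True
    have = tuple(int(part) for part in installed.split("."))
    width = max(len(have), len(minimum))
    return _pad(have, width) >= _pad(minimum, width)
```

For instance, `meets_minimum("4.44.0", parse_minimums("gradio>=4.0.0")["gradio"])` evaluates the pinned SDK version against the declared bound. Pre-release or local version strings would need a real version parser.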