metadata
title: Code Knowledge Graph Explorer - 🤗 Transformers Library
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
tags:
- building-mcp-track-enterprise
Knowledge Graph MCP Explorer
This is a Gradio-based interactive tool for exploring code repository knowledge graphs. It provides a web interface to search, navigate, and analyze code relationships using the Model Context Protocol (MCP).
Features
- Search Nodes: Search for code entities, functions, classes, and more using semantic search
- Graph Navigation: Explore relationships between code elements
- Entity Tracking: View declared and called entities within code chunks
- Path Finding: Find paths between different nodes in the knowledge graph
- Subgraph Extraction: Extract and visualize subgraphs around specific nodes
- File Structure: View the hierarchical structure of the repository
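The path-finding and subgraph-extraction features above can be illustrated with a breadth-first search over an adjacency list. This is a minimal sketch, not the application's actual implementation: the toy graph, the node ID format, and the function names `find_path` and `extract_subgraph` are all hypothetical.

```python
from collections import deque

# Hypothetical toy graph: node IDs and edges are invented for illustration.
GRAPH = {
    "file:modeling_bert.py": ["class:BertModel"],
    "class:BertModel": ["func:BertModel.forward", "class:BertEncoder"],
    "class:BertEncoder": ["func:BertEncoder.forward"],
    "func:BertModel.forward": ["func:BertEncoder.forward"],
    "func:BertEncoder.forward": [],
}


def find_path(graph, start, goal):
    """Return a shortest path from start to goal as a list of node IDs, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def extract_subgraph(graph, center, max_hops):
    """Return nodes within max_hops of center, keeping only edges between them."""
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return {n: [m for m in graph.get(n, []) if m in depth] for n in depth}
```

Calling `find_path(GRAPH, "file:modeling_bert.py", "func:BertEncoder.forward")` walks from the file node to the target function, and `extract_subgraph(GRAPH, "class:BertModel", 1)` keeps only the class and its direct neighbors.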
Usage
The application loads a pre-built knowledge graph of the HuggingFace Transformers repository. You can:
- Search: Use the search tab to find relevant code snippets and entities
- Explore: Navigate through the graph using node IDs
- Analyze: Get statistics about the code structure and relationships
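As a rough idea of what structure statistics might look like, the sketch below counts node types and relation types in a graph held as a plain dictionary. The schema (`nodes` with a `type` field, `edges` with a `relation` field) and the `graph_stats` helper are assumptions made for illustration; the real knowledge graph format may differ.

```python
from collections import Counter


def graph_stats(graph):
    """Summarize a graph dict with hypothetical 'nodes' and 'edges' lists."""
    node_types = Counter(node["type"] for node in graph["nodes"])
    relations = Counter(edge["relation"] for edge in graph["edges"])
    return {
        "num_nodes": len(graph["nodes"]),
        "num_edges": len(graph["edges"]),
        "node_types": dict(node_types),
        "relations": dict(relations),
    }


# Hypothetical example data, not taken from the real dataset.
example = {
    "nodes": [
        {"id": "f1", "type": "file"},
        {"id": "c1", "type": "chunk"},
        {"id": "c2", "type": "chunk"},
    ],
    "edges": [
        {"source": "f1", "target": "c1", "relation": "contains"},
        {"source": "f1", "target": "c2", "relation": "contains"},
        {"source": "c1", "target": "c2", "relation": "calls"},
    ],
}
stats = graph_stats(example)
```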
Technical Details
- Built with Gradio for the web interface
- Uses LanceDB for efficient code indexing and search
- Supports hybrid search (keyword + semantic embeddings)
- Uses pre-computed embeddings from the Salesforce/SFR-Embedding-Code-400M_R model
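To make the idea of hybrid search concrete, here is a minimal sketch that blends a keyword-overlap score with cosine similarity over embeddings. It does not use LanceDB and is not how the application ranks results; the weighting parameter `alpha`, the chunk layout (`id`, `text`, `vector`), and all function names are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of whitespace-separated query terms that appear in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def hybrid_search(query, query_vec, chunks, alpha=0.5, top_k=3):
    """Rank chunks by alpha * keyword score + (1 - alpha) * cosine similarity."""
    scored = []
    for chunk in chunks:
        score = (alpha * keyword_score(query, chunk["text"])
                 + (1 - alpha) * cosine(query_vec, chunk["vector"]))
        scored.append((score, chunk["id"]))
    scored.sort(reverse=True)  # highest combined score first
    return [chunk_id for _, chunk_id in scored[:top_k]]
```

Setting `alpha=1.0` reduces this to pure keyword matching and `alpha=0.0` to pure semantic ranking, which is a simple way to compare the two signals on the same query.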
Data Sources
The application supports loading knowledge graphs from:
1. HuggingFace Hub Dataset (Recommended)
Load directly from a HuggingFace dataset:
python gradio_mcp.py --host 0.0.0.0 --port 7860 --hf-dataset "username/dataset-name"
2. Local JSON File
Use a local JSON file (e.g., multihop_knowledge_graph_with_embeddings.json):
python gradio_mcp.py --host 0.0.0.0 --port 7860 --graph-file data/multihop_knowledge_graph_with_embeddings.json
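The two invocations above suggest a command-line interface with one flag per data source. The sketch below shows how such flags could be parsed with `argparse`. The flag names come from the commands above, but the defaults, the mutual exclusivity of the two sources, and the helper names are assumptions, not a description of what `gradio_mcp.py` actually does.

```python
import argparse


def build_parser():
    """Build an illustrative parser mirroring the flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Knowledge graph explorer (sketch)")
    parser.add_argument("--host", default="0.0.0.0")         # assumed default
    parser.add_argument("--port", type=int, default=7860)    # assumed default
    # Assumption: exactly one data source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--hf-dataset", help="HuggingFace Hub dataset ID")
    source.add_argument("--graph-file", help="Path to a local JSON graph file")
    return parser


def describe_source(args):
    """Return a (kind, location) pair for the selected data source."""
    if args.hf_dataset:
        return ("hub", args.hf_dataset)
    return ("file", args.graph_file)
```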
Creating and Publishing a Dataset
You can save an existing knowledge graph to HuggingFace Hub:
from RepoKnowledgeGraphLib import RepoKnowledgeGraph
# Load from local file
kg = RepoKnowledgeGraph.load("path/to/graph.json")
# Push to HuggingFace Hub (without embeddings to reduce size)
kg.to_hf_dataset("username/my-knowledge-graph", save_embeddings=False, private=False)
# Or with embeddings (larger dataset)
kg.to_hf_dataset("username/my-knowledge-graph-with-embeddings", save_embeddings=True)
Docker Configuration
The default Dockerfile uses a local JSON file. To use a HuggingFace dataset instead, modify the CMD line in the Dockerfile:
# Using HuggingFace dataset (recommended for a smaller Docker image)
CMD ["python", "-u", "gradio_mcp.py", "--host", "0.0.0.0", "--port", "7860", "--hf-dataset", "username/dataset-name"]
# Using local file (requires large data file in image)
CMD ["python", "-u", "gradio_mcp.py", "--host", "0.0.0.0", "--port", "7860", "--graph-file", "/app/data/multihop_knowledge_graph_with_embeddings.json"]
Local Development
To run locally:
docker build -t gradio-mcp-space .
docker run -p 7860:7860 gradio-mcp-space
Or without Docker:
pip install -r requirements.txt
python gradio_mcp.py --host 0.0.0.0 --port 7860 --hf-dataset "username/dataset-name"
Deployment to HuggingFace Spaces
Option 1: Using HuggingFace Dataset (Recommended)
- First, push your knowledge graph to a HuggingFace dataset
- Update the Dockerfile CMD to use --hf-dataset
- Push to the Space repository (no large files needed)
Option 2: Using Local JSON File
- Create a new Space on HuggingFace with Docker SDK
- Enable Git LFS in your Space repository
- Push this directory to the Space repository:
git lfs install
git lfs track "data/*.json"
git add .
git commit -m "Initial commit"
git push
👥 Team
Team Name: CEPIA Ionis Team
Team Members:
- Laila ELKOUSSY - @lailaelkoussy - Research Engineer, Data Scientist
- Julien PEREZ - @jnm38