---
# metadata
title: semmyKG - Knowledge Graph visualiser toolkit (builder from markdown)
emoji: 🕸️
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.44.1
python_version: 3.12
#command: python app_gradio_lightrag.py
app_file: app.py  #app_gradio_lightrag.py
hf_oauth: true
oauth_scopes: [read-access]
hf_oauth_scopes: [inference-api]
license: mit
pinned: true
short_description: semmyKG - Knowledge Graph toolkit
#models: [meta-llama/Llama-4-Maverick-17B-128E-Instruct, openai/gpt-oss-120b, openai/gpt-oss-20b]
models:
  - meta-llama/Llama-4-Maverick-17B-128E-Instruct
  - openai/gpt-oss-120b
  - openai/gpt-oss-20b
tags: [knowledge graph, markdown, RAG, domain]
#preload_from_hub: [https://huggingface.co/datalab-to/surya_layout, https://huggingface.co/datalab-to/surya_tablerec, huggingface.co/datalab-to/line_detector0, https://huggingface.co/tarun-menta/ocr_error_detection/blob/main/config.json]
owner: research-semmyk

#[Project]
#short_description: PDF & HTML parser to markdown
version: 0.2.8.6
readme: README.md
requires-python: ">=3.12"
#dependencies: []
---
# LightRAG Gradio App
A modern, modular Gradio app for knowledge-graph-based Retrieval-Augmented Generation (RAG) using [LightRAG][1]. It supports OpenAI and Ollama LLM backends, markdown document ingestion, and interactive knowledge graph visualisation. Our ParserPDF ([GitHub][3] | [HF Space][4]) pipeline generates markdown from documents (PDF, Word, HTML).
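To make the flow concrete, here is a minimal, hedged sketch of the LightRAG round trip the app builds on: index some markdown, then query it. Import paths and initialisation details differ between LightRAG releases, and `WORKING_DIR` plus the sample text are placeholders, so treat this as illustrative rather than the app's actual code.

```python
# Minimal LightRAG sketch: index one markdown string, then query it.
# Assumes OPENAI_API_KEY is set; import paths vary across LightRAG versions.
import os
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete  # lives elsewhere in older releases

WORKING_DIR = "./rag_storage"  # where LightRAG keeps its KG and vector stores
os.makedirs(WORKING_DIR, exist_ok=True)

rag = LightRAG(working_dir=WORKING_DIR, llm_model_func=gpt_4o_mini_complete)

rag.insert("# Sample note\nLightRAG extracts entities and relations from this text.")
print(rag.query("What does LightRAG extract?", param=QueryParam(mode="hybrid")))
```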
## Features
- LightRAG for dual-level RAG and knowledge graph (KG) construction
- Ingest markdown files from a folder (default: `dataset/data/docs`)
- Query with OpenAI or Ollama backend (user-selectable)
- Visualise the KG interactively in-browser (see the sketch after this list)
- Deployable to a venv, Colab, or HuggingFace Spaces
- Robust, pythonic, modular code (UK English)
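For the in-browser visualisation, a common approach (used in LightRAG's own examples) is to load the GraphML file that LightRAG writes into its working directory and render it with pyvis. The file name and working directory below are assumptions; check them against your setup.

```python
# Render LightRAG's stored knowledge graph as an interactive HTML page.
import networkx as nx
from pyvis.network import Network

# Assumed path: LightRAG conventionally writes this GraphML file into its working dir.
graph_path = "./rag_storage/graph_chunk_entity_relation.graphml"
G = nx.read_graphml(graph_path)

net = Network(height="750px", width="100%", notebook=False)
net.from_nx(G)                           # copy nodes/edges (with attributes) into pyvis
net.save_graph("knowledge_graph.html")   # open this file in a browser
```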
## Setup
### 1. Clone and create venv
```bash
git clone https://github.com/semmyk-research/semmyKG
cd semmyKG

# Option 1: uv (ensure the uv package is installed)
uv venv .venv
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
uv pip sync requirements.txt

# Option 2: standard venv + pip
python -m venv .venv
source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install -r requirements.txt
```
### 2. Configure environment
Copy `.env.example` to `.env` and fill in your keys:
```bash
OPENAI_API_KEY=your-openai-api-key
LLM_MODEL=your-LLM-model-name              # format: provider/model-identifier
OPENAI_API_BASE=your-LLM-inference-provider-endpoint
# for a locally hosted LLM inference server (e.g. LMStudio or Jan.ai), append /v1 to the host, e.g. http://localhost:1234/v1
OPENAI_API_EMBED_BASE=your-embedding-provider-endpoint
# for a locally hosted server, do not include /embedding
LLM_MODEL_EMBED=your-embedding-model       # format: provider/embedding-name
OLLAMA_HOST=http://localhost:11434
OLLAMA_API_KEY=                            # include if required
```
If `.env` is not set, you can enter the values directly in the web UI. <br>
Conversely, values entered in the web UI override those in `.env`.
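A sketch of how such settings are typically resolved, with web-UI input taking precedence over `.env`; the helper and defaults below are illustrative, not the app's actual code:

```python
# Resolve a setting: prefer the value typed into the web UI, fall back to .env,
# then to a default. Variable names mirror the .env example above.
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current directory, if present

def resolve(ui_value: str | None, env_key: str, default: str = "") -> str:
    """Web-UI input overrides .env, which overrides the default."""
    return ui_value or os.getenv(env_key, default)

llm_model = resolve(None, "LLM_MODEL", "openai/gpt-4o-mini")
api_base  = resolve(None, "OPENAI_API_BASE", "https://api.openai.com/v1")
ollama    = resolve(None, "OLLAMA_HOST", "http://localhost:11434")
```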
### 3. Run the app
```bash
python app_gradio_lightrag.py
```
For faster development, use Gradio's reload mode:
```bash
# See https://www.gradio.app/guides/developing-faster-with-reload-mode
gradio app_gradio_lightrag.py --demo-name=gradio_ui
```
### 4. Colab/Spaces
- For HuggingFace Spaces: ensure all dependencies are in `requirements.txt` and set `.env` values via the web UI or Space secrets.
- For Colab: install the requirements and run the app cell, as in the sketch below.
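For Colab, a single cell along these lines mirrors the venv setup above (notebook `!`/`%` syntax; whether a public share link is produced depends on how the app calls `demo.launch()`):

```python
# Colab cell: clone, install pinned dependencies, launch the app.
!git clone https://github.com/semmyk-research/semmyKG
%cd semmyKG
!pip install -r requirements.txt
!python app_gradio_lightrag.py
```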
## Usage
- Browse/select your data folder (default: `dataset/data/docs`)
- Choose the LLM backend (OpenAI or Ollama). [Fix pending: GenAI has a bug yielding role 'assistant' instead of 'user' when updating history.]
- Activate the RAG constructor
- Click 'Index Documents' to build the KG entities
- Click 'Query' to get answers
  - Enter your query and select the query mode (see the sketch after this list)
- Click 'Show Knowledge Graph' to visualise the KG

NB: If using HuggingFace, log in first before browsing/selecting/uploading files and setting LLM parameters.
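The query modes offered in the UI correspond to LightRAG's `QueryParam` modes. A hedged sketch, reusing the `rag` instance from the setup sketch above (the question string is a placeholder):

```python
# LightRAG query modes: "naive" (plain chunk retrieval), "local" (entity-centred),
# "global" (relation/community-centred) and "hybrid" (local + global combined).
from lightrag import QueryParam

question = "Which entities are linked to LightRAG?"  # placeholder query
for mode in ("naive", "local", "global", "hybrid"):
    answer = rag.query(question, param=QueryParam(mode=mode))
    print(f"--- {mode} ---\n{answer}\n")
```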
## Notes
- Only markdown files are supported for ingestion (images in the `/images` subfolder are ignored for now; see the sketch after this list). <br>NB: other formats (PDF, TXT, HTML, ...) will be enabled later.
- To generate markdown from documents (PDF, Word, HTML), use our ParserPDF tool: [GitHub][3] | [HF Space][4].
- All user-facing text is in UK English
- For advanced configuration, see the LightRAG documentation
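As an illustration of the markdown-only rule, a small filter like the one below collects `.md` files from the data folder while skipping anything under an `images` subfolder; the folder names follow the defaults mentioned above, and the snippet is not the app's actual ingestion code:

```python
# Collect markdown files for ingestion, ignoring any images/ subfolder.
from pathlib import Path

def collect_markdown(data_dir: str = "dataset/data/docs") -> list[Path]:
    """Return all .md files under data_dir, skipping 'images' subfolders."""
    root = Path(data_dir)
    return [p for p in root.rglob("*.md") if "images" not in p.parts]

docs = [p.read_text(encoding="utf-8") for p in collect_markdown()]
# rag.insert(docs)  # hand the documents to the indexing step
```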
## Roadmap (no defined timeline)
- HuggingFace log in
- [ParserPDF][3] integration

## License
[MIT][2]
[1]: https://github.com/HKUDS/LightRAG "LightRAG GitHub"
[2]: https://opensource.org/license/mit "MIT License"
[3]: https://github.com/semmyk-research/parserPDF "ParserPDF (GitHub)"
[4]: https://huggingface.co/spaces/semmyk/parserPDF "ParserPDF (HF Space)"