Add pipeline tag and library name metadata #1
by nielsr (HF Staff), opened

README.md CHANGED
````diff
@@ -1,10 +1,12 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-Embedding-8B
+library_name: transformers
+pipeline_tag: feature-extraction
 language:
 - en
 - zh
-base_model:
-- Qwen/Qwen3-Embedding-8B
+license: apache-2.0
 tags:
 - embedding
 - retriever
````
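With `library_name: transformers` and `pipeline_tag: feature-extraction` set, the Hub can attribute the checkpoint to the right library and surface a standard loading snippet. As a rough sketch of what that metadata implies for users (the last-token pooling below is an assumption based on the Qwen3-Embedding lineage, not something this diff specifies):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# library_name: transformers -> the repo loads with Auto* classes;
# pipeline_tag: feature-extraction -> the model is used as an embedder.
tokenizer = AutoTokenizer.from_pretrained("MindscapeRAG/MiA-Emb-8B")
model = AutoModel.from_pretrained("MindscapeRAG/MiA-Emb-8B", torch_dtype=torch.float16)

inputs = tokenizer("Example query", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state

# Assumed pooling: take the final token's hidden state as the embedding, as
# Qwen3-Embedding-style models typically do; the repository README documents
# the exact pooling and normalization the authors intend.
embedding = torch.nn.functional.normalize(hidden[:, -1], dim=-1)
print(embedding.shape)
```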
````diff
@@ -16,7 +18,7 @@ tags:
 [](https://arxiv.org/pdf/2512.17220)
 [](https://huggingface.co/MindscapeRAG/MiA-Emb-8B)
 
-This repository provides the inference implementation for **MiA-Emb (Mindscape-Aware Embedding)**, the retriever component in the **MiA-RAG** framework.
+This repository provides the inference implementation for **MiA-Emb (Mindscape-Aware Embedding)**, the retriever component in the **MiA-RAG** framework, as presented in the paper [Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding](https://huggingface.co/papers/2512.17220).
 
 **MiA-RAG** introduces explicit **global context awareness** via a **Mindscape**—a document-level semantic scaffold constructed by **hierarchical summarization**. By conditioning **both retrieval and generation** on the same Mindscape, MiA-RAG enables globally grounded retrieval and more coherent long-context reasoning.
 
````
````diff
@@ -56,7 +58,7 @@ pip install torch transformers>=4.53.0
 
 ### 1) Initialization
 
-> MiA-Emb-8B is initialized from **`Qwen3-Embedding-8B`**.
+> MiA-Emb-8B is a LoRA adapter initialized from **`Qwen/Qwen3-Embedding-8B`**.
 
 ```python
 import torch
````
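Since the updated blockquote describes MiA-Emb-8B as a LoRA adapter on top of `Qwen/Qwen3-Embedding-8B`, loading would plausibly go through `peft`. A minimal sketch, assuming the repository ships PEFT-format adapter weights (the README's own initialization snippet is authoritative):

```python
import torch
from peft import PeftModel
from transformers import AutoModel, AutoTokenizer

# Base model named in the blockquote above.
base = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-8B", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-8B")

# Assumption: this repo hosts adapter weights that PEFT can attach to the base.
model = PeftModel.from_pretrained(base, "MindscapeRAG/MiA-Emb-8B")
model.eval()
```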
````diff
@@ -99,13 +101,17 @@ Use this mode to retrieve narrative text chunks. A **Global Summary** is injecte
 def get_query_prompt(query, summary="", residual=False):
     """Construct input prompt with global summary (Eq. 5 in paper)."""
     task_desc = "Given a search query with the book's summary, retrieve relevant chunks or helpful entities summaries from the given context that answer the query"
-    summary_prefix = "\n\nHere is the summary providing possibly useful global information. Please encode the query based on the summary:\n"
+    summary_prefix = "
+
+Here is the summary providing possibly useful global information. Please encode the query based on the summary:
+"
 
     # Insert PAD token to capture residual embedding before the summary
     middle_token = tokenizer.pad_token if residual else ""
 
     return (
-        f"Instruct: {task_desc}\n"
+        f"Instruct: {task_desc}
+"
         f"Query: {query}{middle_token}{summary_prefix}{summary}{node_delimiter}"
     )
 
````
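For reference, this is how the function reads when called. `tokenizer` and `node_delimiter` are globals defined earlier in the README's snippet; the query and summary strings below are made up for illustration:

```python
# Hypothetical inputs; `tokenizer` and `node_delimiter` come from the
# surrounding README code, not from this diff.
prompt = get_query_prompt(
    query="Who raises the foundling in the opening chapters?",
    summary="An orphan is taken in by a lighthouse keeper on a remote coast...",
    residual=True,  # inserts tokenizer.pad_token so a residual embedding can be read off
)
print(prompt)
```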
````diff
@@ -210,8 +216,6 @@ print(f"Node Similarity: {final_score.item():.4f}")
 
 ## 📜 Citation
 
-
-
 If you find this work useful, please cite:
 
 ```bibtex
````