Update README.md
```
### Offline Mode (Cached Doc State)
For scenarios where documents are static but queries change (e.g., Search Engines, RAG), you can **pre-compute and cache the document states**. This reduces query-time latency from O(L_doc + L_query) to just O(L_query).
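The reason caching works is that in a causal decoder, the document's key/value states do not depend on the query, so they can be computed once offline and reused. Here is a minimal NumPy sketch of that idea with a toy single-head attention layer — all names and dimensions are illustrative, not this project's actual API:

```python
# Toy illustration of cached doc states: the document's keys/values are
# query-independent, so they can be precomputed offline; at query time only
# the L_query tokens pass through the layer.
import numpy as np

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def kv(x):
    # Per-token keys/values -- these never see the query.
    return x @ Wk, x @ Wv

def attend(x_new, k_ctx, v_ctx):
    """Run only the new tokens, attending over cached context + themselves."""
    k_new, v_new = kv(x_new)
    K = np.concatenate([k_ctx, k_new])
    V = np.concatenate([v_ctx, v_new])
    scores = (x_new @ Wq) @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

doc = rng.standard_normal((50, d))    # L_doc = 50 tokens
query = rng.standard_normal((5, d))   # L_query = 5 tokens

# Offline phase: compute and store the doc's K/V once.
k_doc, v_doc = kv(doc)

# Query phase: only the 5 query tokens are processed (O(L_query)).
cached = attend(query, k_doc, v_doc)

# Identical to recomputing everything from scratch (O(L_doc + L_query)):
empty = np.empty((0, d))
full = attend(np.concatenate([doc, query]), empty, empty)[-5:]
assert np.allclose(cached, full)
```

The equality check at the end is the whole point of the table's "Identical to Online" claim: caching changes the cost, not the result.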
#### Workflow
| Feature | 1. Embedding (Cosine) | 2. Online Reranking | 3. Offline Reranking |
| :--- | :--- | :--- | :--- |
| **Accuracy** | Good | **Best** | **Best** (Identical to Online) |
| **Latency** | Extremely Fast | Slow O(L_doc + L_query) | Fast O(L_query) only |
| **Input** | Query & Doc separate | `Instruct + Doc + Query` | `Query` (on top of cached Doc) |
| **Storage** | Low (Vector only) | None | High (Stores Hidden States) |
| **Best For** | Initial Retrieval (Top-k) | Reranking few candidates | Reranking many candidates |
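To get a feel for the "High" vs. "Low" storage rows, here is a rough back-of-envelope comparison. The model dimensions below are illustrative (a generic ~7B-class decoder), not measurements of this project:

```python
# Back-of-envelope: cached hidden-state storage per document vs. one embedding.
# All dimensions are illustrative assumptions, not this project's actual sizes.
n_layers = 32
d_model = 4096
L_doc = 512            # document length in tokens
bytes_fp16 = 2         # fp16 storage

# Cached doc state: keys + values, for every layer and every token.
kv_bytes = L_doc * n_layers * 2 * d_model * bytes_fp16
# Embedding baseline: a single d_model vector per document.
emb_bytes = d_model * bytes_fp16

print(f"Cached state: {kv_bytes / 2**20:.0f} MiB per doc")   # 256 MiB
print(f"Embedding:    {emb_bytes / 2**10:.0f} KiB per doc")  # 8 KiB
```

Under these assumptions the cached state is tens of thousands of times larger than the embedding, which is why offline mode trades storage for query-time latency.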