SixOpen committed on
Commit b5ceec6 · verified · 1 Parent(s): f304ad1

Update README.md

Files changed (1):
  1. README.md +12 -4

README.md CHANGED
@@ -20,6 +20,10 @@ base_model: Alibaba-NLP/gte-modernbert-base
 
 TL;DR: Stateful embedding model that replaces sliding-window attention with RWKV recurrence, allowing for incremental encoding and streaming semantic search.
 
+Live Demo:
+[![Try in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/SixOpen/HARE)
+
+
 ![image](https://cdn-uploads.huggingface.co/production/uploads/65f47dc77874f3874523c628/GFqHaFy1fplauCi2mkm7M.png)
 
 Conventional embedding models are stateless: adding new content requires re-encoding from scratch because token representations depend on the entire sequence.
@@ -28,9 +32,6 @@ Each recurrent layer maintains a fixed-size state matrix that summarizes all pri
 
 Essentially, the biggest advantage is being able to perform semantic search on large files well before they are 100% available, and across multiple streams simultaneously (for example, parallel distributed files, concurrent transcripts, or documents arriving from different sources on the same topic).
 
-Demo:
-[![Try in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/SixOpen/HARE)
-
 ## Results
 
 ### LongEmbed (Needle/Passkey: nDCG@1; others: nDCG@10)
@@ -243,7 +244,9 @@ Three-stage pipeline:
 | `tokenizer.json` | Tokenizer |
 | `tokenizer_config.json` | Tokenizer config |
 | `surgery.py` | Standalone surgery CLI tool (inspect layers, perform surgery from scratch) |
-| `birwkv7.py` | BiRWKV-7 recurrence layer (required for loading) |
+| `birwkv7.py` | BiRWKV-7 recurrence layer w/ Triton kernel (required for loading) |
+| `modeling_hare.py` | Model wrapper |
+| `configuration_hare.py` | Config class |
 | `streaming.py` | SpanEncoder for stateful incremental encoding |
 
 ## Intended uses
@@ -252,6 +255,11 @@ Three-stage pipeline:
 - Incremental indexing where text arrives sequentially and must be searchable before completion: live transcription, real-time meeting/dispatch/etc. indexing, distributed (i.e., torrent) content search, incremental document editing
 - Multi-vector retrieval with chunk-level or token-level scoring
 
+## Limitations
+
+- This is a research-grade model: although some numbers indicate long-context SOTA in specific categories, it could benefit from seeing more diverse data during training, as the scores on legal case reports and StackOverflow above suggest.
+- Asymmetric streaming context: streaming mode uses forward (left-to-right) state carry, which accumulates the full left context incrementally; the backward scan sees only within each piece, so right context is local.
+
 
 ## Citation
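The "stateful incremental encoding" and forward state-carry behavior described in the diff can be made concrete with a toy sketch. This is not the released `SpanEncoder` API from `streaming.py`, and the recurrence below is a simplified stand-in (a scalar decay instead of RWKV-7's learned, data-dependent decay); the names, dimensions, and constants are illustrative assumptions only. The point it demonstrates is the one the README claims: a fixed-size state summarizes all prior tokens, so chunks arriving over time are each encoded exactly once, and the carried state matches what a from-scratch encode of the full sequence would produce.

```python
import numpy as np

# Toy forward state carry (hypothetical; the real SpanEncoder differs).
rng = np.random.default_rng(0)
D, S = 16, 8                      # token feature dim, fixed state size
W_in = rng.normal(size=(D, S))    # token -> state update projection
decay = 0.9                       # scalar decay; RWKV-7 uses learned, per-channel decay

def encode_chunk(tokens, state):
    """Fold a chunk of token vectors into the running state, one step at a time."""
    for t in tokens:
        state = decay * state + t @ W_in   # linear recurrence: O(1) memory per step
    return state

def embed(state):
    """Read a fixed-size, L2-normalized embedding out of the carried state."""
    return state / np.linalg.norm(state)

# A stream arrives in pieces; the state is carried across them,
# so earlier chunks are never re-encoded.
full = rng.normal(size=(10, D))
state = np.zeros(S)
for chunk in (full[:4], full[4:7], full[7:]):
    state = encode_chunk(chunk, state)

# Encoding everything at once from a fresh state yields the identical result.
assert np.allclose(state, encode_chunk(full, np.zeros(S)))
```

Note the asymmetry called out in the Limitations section: this forward fold accumulates the full left context, but a backward (right-to-left) pass over a stream can only ever see within the chunk it has in hand, which is why right context stays local in streaming mode.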