---
language:
- en
license: mit
pipeline_tag: sentence-similarity
library_name: sentence-transformers
base_model: Supabase/gte-small
tags:
- sentence-transformers
- embeddings
- semantic-search
- compact
- serverless
- neurosense
---

# Neurosense Compact GTE R1

A compact embedding model for semantic retrieval, designed to stay under 200 MB while improving retrieval quality over the previous Neurosense checkpoint.

## Model Summary

- Base model: `Supabase/gte-small`
- Embedding dimension: `384`
- Max sequence length: `512`
- Checkpoint size (local): ~`128MB`
- Intended use: dense retrieval / semantic search

## Benchmark (Retrieval-Only, No Generation)

Evaluated on a fixed 150-case benchmark (50 queries each from `mteb/fiqa`, `mteb/nfcorpus`, `mteb/scifact`) using cosine similarity.

### Aggregate Results

- `hit@1`: `0.6600`
- `hit@5`: `0.8133`
- `mrr@10`: `0.7311`
- `recall@10`: `0.6212`
- `map@10`: `0.5859`

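For reference, the ranking metrics above can be computed from per-query ranked document lists roughly as follows. This is a minimal sketch, not the evaluation harness actually used for the benchmark; document ids and the toy example are illustrative.

```python
def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant doc id appears in the top-k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant doc within the top-k, else 0.0."""
    for i, doc in enumerate(ranked[:k]):
        if doc in relevant:
            return 1.0 / (i + 1)
    return 0.0

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant docs that appear in the top-k results."""
    return len(set(ranked[:k]) & relevant) / len(relevant)

# Toy example: one query with two relevant docs (d1, d2).
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(hit_at_k(ranked, relevant, 1))      # 0.0 (top-1 result is not relevant)
print(mrr_at_k(ranked, relevant, 10))     # 0.5 (first relevant doc at rank 2)
print(recall_at_k(ranked, relevant, 10))  # 1.0 (both relevant docs retrieved)
```

The aggregate numbers reported above would be these per-query values averaged over all 150 cases.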
### Comparison vs. Previous HF Neurosense

Previous Neurosense baseline (`models/Neurosense`):
- `mrr@10`: `0.7242`

This compact checkpoint:
- `mrr@10`: `0.7311`

Delta:
- `+0.0069` MRR@10 on the same benchmark.

## Usage

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Sharjeelbaig/Neurosense-Compact-GTE-R1")
embeddings = model.encode([
    "what causes daytime sleepiness",
    "Sleep apnea can cause fragmented sleep and daytime fatigue",
], normalize_embeddings=True)
```

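Because the embeddings are L2-normalized (`normalize_embeddings=True`), cosine similarity reduces to a plain dot product, which keeps retrieval cheap. A minimal sketch with placeholder unit vectors standing in for real `model.encode` output:

```python
import numpy as np

# Placeholder unit vectors standing in for normalized embeddings;
# in practice these come from model.encode(..., normalize_embeddings=True).
query = np.array([0.6, 0.8, 0.0])
docs = np.array([
    [0.6, 0.8, 0.0],  # same direction as the query -> cosine similarity 1.0
    [0.0, 0.0, 1.0],  # orthogonal to the query -> cosine similarity 0.0
])

# For normalized vectors, cosine similarity is just the dot product.
scores = docs @ query
best = int(np.argmax(scores))
print(scores)  # [1. 0.]
print(best)    # 0
```

The same pattern scales to a full corpus: encode all documents once, stack the embeddings into a matrix, and rank documents per query by `doc_matrix @ query_embedding`.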
## Notes

- This model is compact and practical under serverless deployment constraints.
- It does not aim to replace very large embedding models across all domains; it is optimized for compact deployment with strong retrieval quality.