---
license: apache-2.0
library_name: pytorch
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
- hcae
---
# HCAE-21M (Hybrid Convolutional-Attention Encoder)
**HCAE-21M** is a mid-scale (21 million parameter) text embedding model that combines depthwise separable convolutions with self-attention layers. It performs well on Semantic Textual Similarity and Retrieval tasks while remaining memory-efficient.
<img src="https://cdn-uploads.huggingface.co/production/uploads/680c9127408ea47e6c1dd6e8/0KKsVpqg2Id01nxh8zRjO.png" width="400">
## Architecture Description
- **Size:** ~21M parameters (d_model=384)
- **Lower Layers:** 5 layers of Depthwise Separable Conv1d + FFN.
- **Upper Layers:** 3 layers of Multihead Self-Attention.
- **Pooling Strategy:** Global Mean Pooling.
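The layout above can be sketched in PyTorch. This is an illustrative reconstruction from the bullet points, not the released weights or exact configuration: the kernel size, FFN width, head count, and vocabulary size are assumptions chosen to be consistent with `d_model=384`.

```python
import torch
import torch.nn as nn

D_MODEL = 384  # from the card; layer counts below follow the description


class DepthwiseSeparableConvBlock(nn.Module):
    """Depthwise Conv1d over the sequence, a pointwise projection, then an FFN.

    Kernel size and FFN expansion factor are assumptions for illustration.
    """

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv1d(d_model, d_model, kernel_size,
                                   padding=kernel_size // 2, groups=d_model)
        self.pointwise = nn.Conv1d(d_model, d_model, 1)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        # Conv1d expects (batch, channels, seq), so transpose around the convs.
        h = self.pointwise(self.depthwise(x.transpose(1, 2))).transpose(1, 2)
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))


class HCAESketch(nn.Module):
    """5 depthwise-separable conv blocks -> 3 self-attention layers -> mean pooling."""

    def __init__(self, vocab_size: int = 30522, d_model: int = D_MODEL):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.conv_layers = nn.ModuleList(
            DepthwiseSeparableConvBlock(d_model) for _ in range(5))
        attn_layer = nn.TransformerEncoderLayer(d_model, nhead=6, batch_first=True)
        self.attn_layers = nn.TransformerEncoder(attn_layer, num_layers=3)

    def forward(self, input_ids):  # input_ids: (batch, seq) token ids
        x = self.embed(input_ids)
        for block in self.conv_layers:  # local feature extraction first
            x = block(x)
        x = self.attn_layers(x)         # global interactions on top
        return x.mean(dim=1)            # global mean pooling -> (batch, d_model)
```

A forward pass over a batch of token ids returns one 384-dimensional embedding per sequence; the conv-then-attention ordering keeps the expensive attention layers on top of already-mixed local features.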
## Benchmark Comparison (MTEB)
The table below compares performance across model revisions:
| Model Revision | STSBenchmark (Spearman) | SciFact (Recall@10) | Description |
|---|---|---|---|
| **HCAE-21M-Base** | `0.507` | `0.324` | Baseline configuration trained extensively on the MS MARCO dataset. |
| **HCAE-21M-Instruct** | `0.591` | `0.393` | Multi-stage tuning incorporating ArXiv, STS-B, and SQuAD instruction tuning paradigms. |
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/680c9127408ea47e6c1dd6e8/VuJ6ayS--Ot8i-715fT3a.png" width="800" style="border-radius: 10px; box-shadow: 0 4px 20px rgba(0,0,0,0.3);">
</p>
## Utilization Guidelines (Instruction Format)
For optimal retrieval performance, prepend the instruction to the query text:
`Instruction: Retrieve the exact document that answers the following question. Query: [Your Query]`
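A minimal helper that applies the prefix shown above. The instruction string is taken from this card; the example query and the convention of leaving documents unprefixed are illustrative assumptions.

```python
# Instruction prefix exactly as given in the card.
INSTRUCTION = ("Instruction: Retrieve the exact document that answers "
               "the following question. Query: ")


def format_query(query: str) -> str:
    """Prepend the retrieval instruction to a raw query string."""
    return INSTRUCTION + query


# Example (illustrative query): pass the formatted string, not the raw query,
# to the model's encode call; documents are encoded without any prefix.
formatted = format_query("What causes the aurora borealis?")
```

Only queries receive the prefix; applying it to documents as well would shift both sides of the similarity computation and defeat the asymmetric retrieval setup.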