---
license: apache-2.0
library_name: pytorch
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
- hcae
---

# HCAE-21M (Hybrid Convolutional-Attention Encoder)

**HCAE-21M** is a mid-scale (21 million parameter) text embedding model that combines depthwise separable convolutions with self-attention layers. It achieves strong performance on Semantic Textual Similarity and Retrieval tasks while remaining highly memory-efficient.

## Architecture Description

- **Size:** ~21M parameters (d_model=384)
- **Lower Layers:** 5 layers of Depthwise Separable Conv1d + FFN.
- **Upper Layers:** 3 layers of Multihead Self-Attention.
- **Pooling Strategy:** Global Mean Pooling.

## Benchmark Comparison (MTEB)

The table below compares MTEB scores across model revisions:

| Model Revision | STSBenchmark (Spearman) | SciFact (Recall@10) | Description |
|---|---|---|---|
| **HCAE-21M-Base** | `0.507` | `0.324` | Baseline configuration trained on the MS MARCO dataset. |
| **HCAE-21M-Instruct** | `0.591` | `0.393` | Multi-stage tuning on ArXiv, STS-B, and SQuAD instruction data. |
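The hybrid layout described under Architecture Description can be sketched in PyTorch as follows. This is a minimal, hypothetical reconstruction: the class names, FFN expansion factor, kernel size, head count, and vocabulary size are assumptions, not the released implementation.

```python
# Hypothetical sketch of the HCAE-21M layout; hyperparameters other than
# d_model=384, 5 conv layers, and 3 attention layers are assumptions.
import torch
import torch.nn as nn

D_MODEL = 384  # from the model card


class ConvBlock(nn.Module):
    """Depthwise separable Conv1d followed by a feed-forward sublayer."""
    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise: one filter per channel; pointwise: 1x1 channel mixing.
        self.depthwise = nn.Conv1d(d_model, d_model, kernel_size,
                                   padding=kernel_size // 2, groups=d_model)
        self.pointwise = nn.Conv1d(d_model, d_model, 1)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model),
                                 nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        # Conv1d expects (batch, channels, seq), so transpose around it.
        h = self.pointwise(self.depthwise(x.transpose(1, 2))).transpose(1, 2)
        x = self.norm1(x + h)
        return self.norm2(x + self.ffn(x))


class AttnBlock(nn.Module):
    """Standard multi-head self-attention sublayer with residual + norm."""
    def __init__(self, d_model: int, n_heads: int = 6):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        h, _ = self.attn(x, x, x)
        return self.norm(x + h)


class HCAE(nn.Module):
    def __init__(self, vocab_size: int = 30522, d_model: int = D_MODEL):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lower = nn.Sequential(*[ConvBlock(d_model) for _ in range(5)])
        self.upper = nn.Sequential(*[AttnBlock(d_model) for _ in range(3)])

    def forward(self, token_ids):  # (batch, seq) -> (batch, d_model)
        x = self.upper(self.lower(self.embed(token_ids)))
        return x.mean(dim=1)  # global mean pooling


emb = HCAE()(torch.randint(0, 30522, (2, 16)))
print(emb.shape)  # torch.Size([2, 384])
```

The convolutional lower stack captures local n-gram features cheaply, while the attention upper stack models long-range interactions; mean pooling then collapses the sequence into a single 384-dimensional embedding.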

## Utilization Guidelines (Instruction Format)

For optimal retrieval performance, prepend the following instruction to the query text: `Instruction: Retrieve the exact document that answers the following question. Query: [Your Query]`
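The prefixing step above can be sketched as a small helper. `format_query` is a hypothetical name, and the commented-out sentence-transformers loading assumes the checkpoint is published in that format; only the query side receives the prefix.

```python
# Hypothetical helper applying the instruction format from this card.
INSTRUCTION = ("Instruction: Retrieve the exact document that answers "
               "the following question. Query: ")

def format_query(query: str) -> str:
    """Prepend the retrieval instruction to a raw query string."""
    return INSTRUCTION + query

formatted = format_query("What causes ocean tides?")
print(formatted)

# To embed with sentence-transformers (assuming the checkpoint supports it):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("HCAE-21M")  # hypothetical model id
# query_vec = model.encode(format_query("What causes ocean tides?"))
# doc_vecs = model.encode(["Tides result from ..."])  # documents: no prefix
```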