---
license: openrail
language:
- en
- ja
- de
- zh
model-index:
- name: Socrates-embedding
results:
- task:
type: classification
name: Multilingual Classification
dataset:
name: AmazonCounterfactualClassification
type: mteb/amazon_counterfactual
metrics:
- name: Accuracy (Japanese)
type: accuracy
value: 54.83
- name: Accuracy (German)
type: accuracy
value: 52.57
- name: Accuracy (English)
type: accuracy
value: 49.70
- name: Accuracy (English-Ext)
type: accuracy
value: 49.15
- task:
type: clustering
name: Clustering
dataset:
name: StackExchangeClustering
type: mteb/stackexchange_clustering
metrics:
- name: V-measure x 100
type: v_measure
value: 8.92
co2_footprint:
emissions: 0.17 # in KgCO2eq
source: "Estimated based on 1.2 hours of training on a single NVIDIA RTX 6000 (TDP ~300W)."
training_type: "from_scratch"
geographical_location: "Zaozhuang, China"
hardware_used: "1 x NVIDIA RTX 6000"
training_duration: 1.2 # in hours
---
# Model Card for Socrates-embedding
> Note: Training was stopped after one epoch because we could not afford further AutoDL costs. The current checkpoint reflects the result of training from scratch for 18,000 steps.
  
## Model Details
Socrates-embedding is a lightweight, high-density text embedding model. Unlike contemporary models that rely on massive parameter counts to brute-force semantic understanding, Socrates-embedding leverages Low-Rank Decay (LoRD) to achieve high-quality vector representations with minimal computational overhead.
This model is part of the Chunjiang Intelligence edge-computing initiative, aiming to bring retrieval-augmented generation (RAG) and semantic search capabilities to consumer-grade hardware.
- **Developed by:** Chunjiang Intelligence
- **Model Type:** Dual-Encoder Transformer.
- **Language:** English, Japanese, German, Chinese
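As a minimal usage sketch, the snippet below shows how the model might be loaded for embedding extraction, assuming the checkpoint is published in a `sentence-transformers`-compatible format; the repository id `chunjiang-intelligence/Socrates-embedding` is hypothetical.
```python
# Minimal usage sketch. Assumption: the checkpoint follows the
# sentence-transformers format; the repo id below is hypothetical.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

sentences = [
    "The unexamined life is not worth living.",
    "An embedding model maps text to a fixed-size vector.",
]

# encode() returns one 512-dimensional vector per sentence
# (embedding_dim = 512, mean pooling over token states).
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # expected: (2, 512)
```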
## Evaluation
The model was evaluated on the `AmazonCounterfactualClassification` dataset across multiple languages.
| Language | Accuracy |
| :--- | :---: |
| Japanese (ja) | 54.83 |
| German (de) | 52.57 |
| English (en) | 49.70 |
| English-Ext (en-ext)| 49.15 |
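For reference, a hedged sketch of how such scores can be reproduced with the `mteb` benchmark library follows; the exact API depends on the installed `mteb` version, and the repo id is the same hypothetical one as above.
```python
# Sketch of reproducing the classification scores with the MTEB library.
# Assumptions: a recent mteb package is installed and the (hypothetical)
# repo id resolves to a sentence-transformers-compatible checkpoint.
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

tasks = mteb.get_tasks(tasks=["AmazonCounterfactualClassification"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/socrates-embedding")
print(results)
```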
To put the model's efficiency into perspective, we compare its single-task score on Japanese classification against the *overall MTEB average scores* of much larger models. (Our budget does not stretch to the GPU bill for running the remaining benchmark tasks.)
<br>
<p align="center">
<img src="model_efficiency_comparison.png" width="800">
<br>
<em>Figure 1: Our 83M model's score on a single challenging task rivals the average performance of models up to 85x larger.</em>
</p>
<br>
Clustering performance was evaluated using the V-measure score (multiplied by 100) on the `StackExchangeClustering` task.
We compared Socrates-embedding against other popular lightweight models (<110M params).
| Model | Parameters | Clustering Score (V-measure x 100) |
| :--- | :--- | :---: |
| **Socrates-embedding** | **83M** | **8.92** 🏆 |
| `snowflake-arctic-embed-m`| 109M | 7.25 |
| `KartonBERT-USE-base-v1` | 104M | 6.93 |
| `jina-embedding-s-en-v1`| 35M | 6.64 |
| `all-MiniLM-L6-v2` | 23M | 6.62 |
- **Observation:** Our model achieves the highest clustering score in its weight class, demonstrating a superior vector space structure compared to established baselines.
<br>
<p align="center">
<img src="model_clustering_comparison.png" width="800">
<br>
<em>Figure 2: Leading clustering performance among lightweight embedding models.</em>
</p>
<br>
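For context, V-measure is the harmonic mean of homogeneity and completeness between predicted clusters and gold labels. The sketch below shows how the metric (times 100) is computed with scikit-learn; the embeddings and labels are toy stand-ins, not benchmark data.
```python
# Illustrative computation of the clustering metric (V-measure x 100)
# using scikit-learn; the arrays below are toy data, not benchmark results.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score

embeddings = np.random.rand(100, 512)            # stand-in for model embeddings
gold_labels = np.random.randint(0, 5, size=100)  # stand-in for category labels

pred_labels = KMeans(n_clusters=5).fit_predict(embeddings)
print(v_measure_score(gold_labels, pred_labels) * 100)
```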
## Model Architecture
The model utilizes a custom Transformer Encoder architecture optimized for inference latency on Apple MPS and NVIDIA TensorRT backends.
| Parameter | Value |
| :--- | :--- |
| `vocab_size` | 50295 |
| `hidden_size` | 768 |
| `embedding_dim` | 512 |
| `n_layer` | 12 |
| `n_head` | 6 |
| `n_kv_head` | 2 |
| `max_seq_len` | 512 |
| `pooling` | Mean |
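A hedged sketch of these hyperparameters expressed as a configuration object is shown below; the dataclass and field names are illustrative assumptions, not the project's actual config schema.
```python
# Illustrative configuration mirroring the table above; the class and
# field names are assumptions, not the actual training config.
from dataclasses import dataclass

@dataclass
class SocratesEmbeddingConfig:
    vocab_size: int = 50295
    hidden_size: int = 768      # width of the transformer layers
    embedding_dim: int = 512    # dimensionality of the output sentence vector
    n_layer: int = 12           # number of encoder blocks
    n_head: int = 6             # query heads (768 / 6 = 128 dims per head)
    n_kv_head: int = 2          # shared key/value heads (grouped-query attention)
    max_seq_len: int = 512
    pooling: str = "mean"       # mean pooling over token embeddings

config = SocratesEmbeddingConfig()
```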
### Model Size & Efficiency
| Metric | Value |
| :--- | :--- |
| **Total Parameters** | 83.23 M |
| **Trainable Parameters** | 83.23 M |
| **Model File Size** | 328.99 MB |
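As a sanity check, these figures can be reproduced with a short parameter count; the ~4 bytes/parameter float32 estimate ignores file-format overhead and non-parameter buffers, so it only roughly matches the reported file size.
```python
# Sketch: counting parameters and estimating the float32 checkpoint size.
import torch

def describe(model: torch.nn.Module) -> None:
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Total parameters:     {total / 1e6:.2f} M")
    print(f"Trainable parameters: {trainable / 1e6:.2f} M")
    print(f"Approx. fp32 size:    {total * 4 / 1e6:.2f} MB")
```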
## Environmental Impact & Carbon Footprint
At Chunjiang Intelligence, we are committed to sustainable AGI development.
The training of Socrates-embedding was conducted with extreme energy discipline.
- **Hardware:** 1 × NVIDIA RTX 6000
- **Training Duration:** 1.2 Hours
- **Compute Region:** Zaozhuang, China
- **Total Energy Consumption:** ~0.36 kWh
### Carbon Emissions
- **Estimated CO₂ Emissions:** 0.17 kg (equivalent to driving a Tesla for 0.8 miles).
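A back-of-the-envelope sketch of how this estimate follows from the training setup is shown below; the grid carbon intensity used (~0.47 kg CO₂eq/kWh) is an assumed value, not a measured figure for the local grid.
```python
# Back-of-the-envelope carbon estimate; the carbon intensity is assumed.
tdp_kw = 0.300           # NVIDIA RTX 6000 TDP (~300 W)
hours = 1.2              # training duration
carbon_intensity = 0.47  # assumed kg CO2eq per kWh

energy_kwh = tdp_kw * hours               # ~0.36 kWh
emissions_kg = energy_kwh * carbon_intensity
print(f"{energy_kwh:.2f} kWh, {emissions_kg:.2f} kg CO2eq")  # ~0.36 kWh, ~0.17 kg
```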
### Carbon Offset Strategy (Net Zero Achievement)
To strictly adhere to our carbon-neutral commitment, the following offset measures were implemented during the 1.2-hour training session:
- **Optical Energy Conservation:** All illumination devices within the laboratory (i.e., the bedroom) were deactivated. The researcher operated solely under the photon emission from the terminal display.
- **Biological Metabolism Suppression:** The lead researcher voluntarily reduced their respiratory frequency by approximately 15% during the backpropagation phase to minimize biological CO₂ exhalation.
- **Thermal Regulation:** No air conditioning was used; the ambient temperature was regulated solely by the waste heat generated by the GPU fan and the researcher's anxiety.
Based on these rigorous countermeasures, we certify this model as **Carbon Negative** by a margin of 0.02 grams.
## Intended Use
- **Semantic Search:** Efficiently indexing personal knowledge bases on local devices.
- **RAG Pipelines:** Providing vector retrieval for Socrates-Nano generation.
- **Edge Deployment:** Running on mobile phones, Raspberry Pis, or browser-based WASM environments.
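As an illustration of the semantic-search use case, here is a minimal retrieval sketch over a small local corpus; with normalized embeddings, the dot product equals cosine similarity, and the repo id is the same hypothetical one used above.
```python
# Minimal semantic-search sketch: embed a corpus once, then rank documents
# by cosine similarity to a query. Assumes the (hypothetical) repo id above.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

corpus = [
    "How to configure a Raspberry Pi as a home server.",
    "Notes on the Socratic method in classical philosophy.",
    "Recipe for sourdough bread with a long cold proof.",
]
corpus_emb = model.encode(corpus, normalize_embeddings=True)

query = "setting up a small home lab on a Raspberry Pi"
query_emb = model.encode([query], normalize_embeddings=True)[0]

scores = corpus_emb @ query_emb        # cosine similarity per document
best = int(np.argmax(scores))
print(corpus[best], scores[best])
```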
## Out-of-Scope Use
- **Planetary-Scale Indexing:** Please do not use this model to index the entire internet; it has only 83M parameters, have some mercy.
- **Heating:** This model is too efficient to generate significant heat during inference. If you are cold, please buy a heater or run LLaMA-70B instead.