---
license: openrail
language:
- en
- ja
- de
- zh
model-index:
- name: Socrates-embedding
results:
- task:
type: classification
name: Multilingual Classification
dataset:
name: AmazonCounterfactualClassification
type: mteb/amazon_counterfactual
metrics:
- name: Accuracy (Japanese)
type: accuracy
value: 54.83
- name: Accuracy (German)
type: accuracy
value: 52.57
- name: Accuracy (English)
type: accuracy
value: 49.70
- name: Accuracy (English-Ext)
type: accuracy
value: 49.15
- task:
type: clustering
name: Clustering
dataset:
name: StackExchangeClustering
type: mteb/stackexchange_clustering
metrics:
- name: V-measure x 100
type: v_measure
value: 8.92
co2_footprint:
emissions: 0.17 # in KgCO2eq
source: "Estimated based on 1.2 hours of training on a single NVIDIA RTX 6000 (TDP ~300W)."
training_type: "from_scratch"
geographical_location: "Zaozhuang, China"
hardware_used: "1 x NVIDIA RTX 6000"
training_duration: 1.2 # in hours
---
# Model Card for Socrates-embedding
> Note: Training was stopped after one epoch because we could not afford further AutoDL costs. The current checkpoint reflects the result of training from scratch for 18,000 steps.
  
## Model Details
Socrates-embedding is a lightweight, high-density text embedding model. Unlike contemporary models that rely on massive parameter counts to brute-force semantic understanding, Socrates-embedding leverages Low-Rank Decay (LoRD) to achieve high-quality vector representations with minimal computational overhead.
This model is part of the Chunjiang Intelligence edge-computing initiative, aiming to bring retrieval-augmented generation (RAG) and semantic search capabilities to consumer-grade hardware.
- **Developed by:** Chunjiang Intelligence
- **Model Type:** Dual-Encoder Transformer.
- **Language:** English, Japanese, German, Chinese
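As a minimal usage sketch, the snippet below shows how the model might be loaded for embedding extraction, assuming the checkpoint is published in a `sentence-transformers`-compatible format; the repository id `chunjiang-intelligence/Socrates-embedding` is hypothetical.
```python
# Minimal usage sketch. Assumption: the checkpoint follows the
# sentence-transformers format; the repo id below is hypothetical.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

sentences = [
    "The unexamined life is not worth living.",
    "An embedding model maps text to a fixed-size vector.",
]

# encode() returns one 512-dimensional vector per sentence
# (embedding_dim = 512, mean pooling over token states).
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # expected: (2, 512)
```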
## Evaluation
The model was evaluated on the `AmazonCounterfactualClassification` dataset across multiple languages.
| Language | Accuracy |
| :--- | :---: |
| Japanese (ja) | 54.83 |
| German (de) | 52.57 |
| English (en) | 49.70 |
| English-Ext (en-ext)| 49.15 |
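For reference, a hedged sketch of how such scores can be reproduced with the `mteb` benchmark library follows; the exact API depends on the installed `mteb` version, and the repo id is the same hypothetical one as above.
```python
# Sketch of reproducing the classification scores with the MTEB library.
# Assumptions: a recent mteb package is installed and the (hypothetical)
# repo id resolves to a sentence-transformers-compatible checkpoint.
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

tasks = mteb.get_tasks(tasks=["AmazonCounterfactualClassification"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/socrates-embedding")
print(results)
```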
To put the model's efficiency into perspective, we compare its single-task score on Japanese classification against the *overall MTEB average scores* of much larger models. (Our budget does not stretch to the GPU bill for running the remaining benchmark tasks.)
<br>
<p align="center">
<img src="model_efficiency_comparison.png" width="800">
<br>
<em>Figure 1: Our 83M model's score on a single challenging task rivals the average performance of models up to 85x larger.</em>
</p>
<br>
Clustering performance was evaluated using the V-measure score (multiplied by 100) on the `StackExchangeClustering` task.
We compared Socrates-embedding against other popular lightweight models (<110M params).
| Model | Parameters | Clustering Score (V-measure x 100) |
| :--- | :--- | :---: |
| **Socrates-embedding** | **83M** | **8.92** 🏆 |
| `snowflake-arctic-embed-m`| 109M | 7.25 |
| `KartonBERT-USE-base-v1` | 104M | 6.93 |
| `jina-embedding-s-en-v1`| 35M | 6.64 |
| `all-MiniLM-L6-v2` | 23M | 6.62 |
- **Observation:** Our model achieves the highest clustering score in its weight class, demonstrating a superior vector space structure compared to established baselines.
<br>
<p align="center">
<img src="model_clustering_comparison.png" width="800">
<br>
<em>Figure 2: Leading clustering performance among lightweight embedding models.</em>
</p>
<br>
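For context, V-measure is the harmonic mean of homogeneity and completeness between predicted clusters and gold labels. The sketch below shows how the metric (times 100) is computed with scikit-learn; the embeddings and labels are toy stand-ins, not benchmark data.
```python
# Illustrative computation of the clustering metric (V-measure x 100)
# using scikit-learn; the arrays below are toy data, not benchmark results.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import v_measure_score

embeddings = np.random.rand(100, 512)            # stand-in for model embeddings
gold_labels = np.random.randint(0, 5, size=100)  # stand-in for category labels

pred_labels = KMeans(n_clusters=5).fit_predict(embeddings)
print(v_measure_score(gold_labels, pred_labels) * 100)
```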
## Model Architecture
The model utilizes a custom Transformer Encoder architecture optimized for inference latency on Apple MPS and NVIDIA TensorRT backends.
| Parameter | Value |
| :--- | :--- |
| `vocab_size` | 50295 |
| `hidden_size` | 768 |
| `embedding_dim` | 512 |
| `n_layer` | 12 |
| `n_head` | 6 |
| `n_kv_head` | 2 |
| `max_seq_len` | 512 |
| `pooling` | Mean |
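A hedged sketch of these hyperparameters expressed as a configuration object is shown below; the dataclass and field names are illustrative assumptions, not the project's actual config schema.
```python
# Illustrative configuration mirroring the table above; the class and
# field names are assumptions, not the actual training config.
from dataclasses import dataclass

@dataclass
class SocratesEmbeddingConfig:
    vocab_size: int = 50295
    hidden_size: int = 768      # width of the transformer layers
    embedding_dim: int = 512    # dimensionality of the output sentence vector
    n_layer: int = 12           # number of encoder blocks
    n_head: int = 6             # query heads (768 / 6 = 128 dims per head)
    n_kv_head: int = 2          # shared key/value heads (grouped-query attention)
    max_seq_len: int = 512
    pooling: str = "mean"       # mean pooling over token embeddings

config = SocratesEmbeddingConfig()
```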
### Model Size & Efficiency
| Metric | Value |
| :--- | :--- |
| **Total Parameters** | 83.23 M |
| **Trainable Parameters** | 83.23 M |
| **Model File Size** | 328.99 MB |
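As a sanity check, these figures can be reproduced with a short parameter count; the ~4 bytes/parameter float32 estimate ignores file-format overhead and non-parameter buffers, so it only roughly matches the reported file size.
```python
# Sketch: counting parameters and estimating the float32 checkpoint size.
import torch

def describe(model: torch.nn.Module) -> None:
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Total parameters:     {total / 1e6:.2f} M")
    print(f"Trainable parameters: {trainable / 1e6:.2f} M")
    print(f"Approx. fp32 size:    {total * 4 / 1e6:.2f} MB")
```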
## Environmental Impact & Carbon Footprint
At Chunjiang Intelligence, we are committed to sustainable AGI development.
The training of Socrates-embedding was conducted with extreme energy discipline.
- **Hardware:** 1 × NVIDIA RTX 6000
- **Training Duration:** 1.2 Hours
- **Compute Region:** Zaozhuang, China
- **Total Energy Consumption:** ~0.36 kWh
### Carbon Emissions
- **Estimated CO₂ Emissions:** 0.17 kg (equivalent to driving a Tesla for 0.8 miles).
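A back-of-the-envelope sketch of how this estimate follows from the training setup is shown below; the grid carbon intensity used (~0.47 kg CO₂eq/kWh) is an assumed value, not a measured figure for the local grid.
```python
# Back-of-the-envelope carbon estimate; the carbon intensity is assumed.
tdp_kw = 0.300           # NVIDIA RTX 6000 TDP (~300 W)
hours = 1.2              # training duration
carbon_intensity = 0.47  # assumed kg CO2eq per kWh

energy_kwh = tdp_kw * hours               # ~0.36 kWh
emissions_kg = energy_kwh * carbon_intensity
print(f"{energy_kwh:.2f} kWh, {emissions_kg:.2f} kg CO2eq")  # ~0.36 kWh, ~0.17 kg
```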
### Carbon Offset Strategy (Net Zero Achievement)
To strictly adhere to our carbon-neutral commitment, the following offset measures were implemented during the 1.2-hour training session:
- **Optical Energy Conservation:** All illumination devices within the laboratory (i.e., the bedroom) were deactivated. The researcher operated solely under the photon emission from the terminal display.
- **Biological Metabolism Suppression:** The lead researcher voluntarily reduced their respiratory frequency by approximately 15% during the backpropagation phase to minimize biological CO₂ exhalation.
- **Thermal Regulation:** No air conditioning was used; the ambient temperature was regulated solely by the waste heat generated by the GPU fan and the researcher's anxiety.
Based on these rigorous countermeasures, we certify this model as **Carbon Negative** by a margin of 0.02 grams.
## Intended Use
- **Semantic Search:** Efficiently indexing personal knowledge bases on local devices.
- **RAG Pipelines:** Providing vector retrieval for Socrates-Nano generation.
- **Edge Deployment:** Running on mobile phones, Raspberry Pis, or browser-based WASM environments.
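As an illustration of the semantic-search use case, here is a minimal retrieval sketch over a small local corpus; with normalized embeddings, the dot product equals cosine similarity, and the repo id is the same hypothetical one used above.
```python
# Minimal semantic-search sketch: embed a corpus once, then rank documents
# by cosine similarity to a query. Assumes the (hypothetical) repo id above.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("chunjiang-intelligence/Socrates-embedding")

corpus = [
    "How to configure a Raspberry Pi as a home server.",
    "Notes on the Socratic method in classical philosophy.",
    "Recipe for sourdough bread with a long cold proof.",
]
corpus_emb = model.encode(corpus, normalize_embeddings=True)

query = "setting up a small home lab on a Raspberry Pi"
query_emb = model.encode([query], normalize_embeddings=True)[0]

scores = corpus_emb @ query_emb        # cosine similarity per document
best = int(np.argmax(scores))
print(corpus[best], scores[best])
```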
## Out-of-Scope Use
- **Planetary-Scale Indexing:** Please do not use this model to index the entire internet; it has only 83M parameters, have some mercy.
- **Heating:** This model is too efficient to generate significant heat during inference. If you are cold, please buy a heater or run LLaMA-70B instead.