
@@ -15,20 +15,22 @@ base_model:
 - Qwen/Qwen3-Embedding-8B
 ---
-# CRE v1.1: CareerInternational Recruitment Embedding Model 🚀
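-> **CRE v1.1** is an LLM-based embedding model adapted to the recruitment domain. Compared with traditional BERT-style models, it uses long-context fusion and instruction control to deliver much stronger semantic representations, perfectly solving the problem of aligning heterogeneous text between job descriptions (JD) and résumés (CV).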
 ---
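-### 📖 Technical Background (Technical Report Summary)
-**2025/06/28 Released the CRE v1.1 model and technical report.** This study investigates the domain adaptation mechanism of LLM-based embedding models in recruitment semantic matching. The core findings demonstrate:
 1. **Effectiveness of the adaptation training paradigm**: **LoRA lightweight fine-tuning** combined with **domain synthetic data** significantly improves performance on the three core matching tasks: JD2JD, JD2CV, and CV2CV.
 2. **A new trend in technical evolution**: LLM-based embeddings natively support multi-granularity semantic parsing (for example, capturing hypernym and hyponym relations between skills), which effectively avoids the structural bottlenecks of traditional models.
 3. **Industrial deployment value**: With **Enhanced Query Construction** used during training and raw queries applied directly at test time, the model shows strong robustness and practicality.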
 ---
 ### Key Features
@@ -40,13 +42,7 @@ base_model:
 ---
-### Quick Start
-#### 1. Dependencies
-```bash
-pip install "transformers>=4.51.0" "sentence-transformers>=2.7.0"
 ### Using Sentence-Transformers
 ```python
 from sentence_transformers import SentenceTransformer, util
@@ -62,22 +58,14 @@ print("查询结果:", util.cos_sim(query_embedding, passage_embedding))
 ```
 ### 📊 Expected Output Comparison
-| Model | Similarity 1 (vs. CV 1) | Similarity 2 (vs. CV 2) |
-| :--- | :---: | :---: |
-| **CRE-1.1** | 0.5816 | **0.6093** |
-| **Qwen3-Embedding-8B** | **0.7731** | 0.7638 |
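 ### 🛠️ Technical Specifications
-* **Pooling Strategy**: We recommend the model's default representation (typically the last token or CLS).
-* **Inference Optimization**: For long texts, we strongly recommend enabling `flash_attention_2` and setting `padding_side="left"`.
 * **Task Support**: Deeply optimized for recruitment-domain tasks such as JD2JD, JD2CV, and CV2CV.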
----
-### 📜 Citation
-If you find this research and model helpful in your recruitment matching tasks, please cite our technical report:
-```text
-CRE v1.1 Team. (2025). Domain Adaptation Mechanism of LLM-based Embedding Models in Recruitment Semantic Matching. CareerInternational AI Lab Technical Report.

 - Qwen/Qwen3-Embedding-8B
 ---
+# CRE: CareerInternational Recruitment Embedding Model 🚀
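+> **CRE-1.1** is an LLM-based embedding model adapted to the recruitment domain. Compared with traditional BERT-style models, it uses long-context fusion and instruction control to deliver much stronger semantic representations, improving the alignment of heterogeneous text between job descriptions (JD) and résumés (CV).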
 ---
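+### Release Notes
+* **2026/06/28**: Released **CRE-1.1**, with improved long-text feature extraction and inference performance.
+* **2025/03/28**: Released the initial **CRE-0.5** version and technical report.
+### 📖 Technical Background (Technical Report Summary)
+This study investigates the domain adaptation mechanism of LLM-based embedding models in recruitment semantic matching. The core findings demonstrate:
 1. **Effectiveness of the adaptation training paradigm**: **LoRA lightweight fine-tuning** combined with **domain synthetic data** significantly improves performance on the three core matching tasks: JD2JD, JD2CV, and CV2CV.
 2. **A new trend in technical evolution**: LLM-based embeddings natively support multi-granularity semantic parsing (for example, capturing hypernym and hyponym relations between skills), which effectively avoids the structural bottlenecks of traditional models.
 3. **Industrial deployment value**: With **Enhanced Query Construction** used during training and raw queries applied directly at test time, the model shows strong robustness and practicality.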
 ---
 ### Key Features
 ---
 ### Using Sentence-Transformers
 ```python
 from sentence_transformers import SentenceTransformer, util
 ```
 ### 📊 Expected Output Comparison
+| Model                  | Similarity 1 (vs. CV 1) | Similarity 2 (vs. CV 2) |
+| :---                   | :---:                   | :---:                   |
+| **CRE-1.1**            | 0.5816                  | **0.6093**              |
+| **Qwen3-Embedding-8B** | **0.7731**              | 0.7638                  |
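
The scores in the table above are cosine similarities, and the bold entry in each row marks the candidate that model ranks first. As a minimal sketch (plain Python, not the card's own code; `cos_sim`, `top_cv`, and `table_scores` are illustrative names), the following shows how cosine similarity is computed and how the two models' rankings differ even though the base model's absolute scores are higher:

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Scores copied from the table above: (similarity to CV 1, similarity to CV 2).
table_scores = {
    "CRE-1.1": (0.5816, 0.6093),
    "Qwen3-Embedding-8B": (0.7731, 0.7638),
}

def top_cv(scores):
    """Return the 1-based index of the CV with the highest similarity."""
    return max(range(len(scores)), key=lambda i: scores[i]) + 1

preferred = {model: top_cv(s) for model, s in table_scores.items()}
print(preferred)  # {'CRE-1.1': 2, 'Qwen3-Embedding-8B': 1}
```

For retrieval, the relative order of candidates is what determines the result, so scores from different models are not directly comparable in magnitude.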
 ### 🛠️ Technical Specifications
+* **Pooling Strategy**: We recommend the model's default representation (last token pooling).
 * **Task Support**: Deeply optimized for recruitment-domain tasks such as JD2JD, JD2CV, and CV2CV.
+---
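
To make the "last token pooling" mentioned under Technical Specifications concrete, here is a minimal pure-Python sketch of the idea. It is an illustration under simplifying assumptions (nested lists instead of tensors, a hypothetical `last_token_pool` helper), not the model's actual implementation; when the model is loaded through Sentence-Transformers, pooling is normally handled by the library according to the model's configuration.

```python
def last_token_pool(hidden_states, attention_mask):
    """Select the hidden state of the last non-padding token in each sequence.

    hidden_states: list of sequences, each a list of per-token vectors.
    attention_mask: list of sequences of 0/1 flags (1 = real token, 0 = padding).
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        # Position of the last real token; this works for left or right padding.
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(states[last])
    return pooled

# Toy batch: the first sequence is left-padded, the second is right-padded.
hidden = [
    [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0], [0.0, 0.0]],
]
mask = [
    [0, 1, 1],
    [1, 1, 0],
]
print(last_token_pool(hidden, mask))  # [[3.0, 4.0], [7.0, 8.0]]
```

With left padding the last real token always sits at the final position, which is why left padding is a common choice for decoder-style embedding models.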