feather820 committed (verified)
Commit 14632b6 · 1 Parent(s): b9fddd3

Update README.md

Files changed (1): README.md +17 -2

README.md CHANGED
@@ -7,9 +7,17 @@ library_name: transformers
 tags:
 - code
 ---
+
+<div align="center" style="display: flex; justify-content: center; align-items: center; gap: 20px;">
+<a href="https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main/" style="display: flex; align-items: center; text-decoration: none; color: inherit;">
+<img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" width="30" height="30" style="vertical-align: middle; margin-right: 8px;">
+<span style="font-size: 1.5em; font-weight: bold;">CodeFuse-Embeddings</span>
+</a>
+</div>
+
 # Introduction
 
-C2LLM: Advanced Code Embeddings for Deep Semantic Understanding
+## C2LLM: Advanced Code Embeddings for Deep Semantic Understanding
 
 **C2LLM (Code Contrastive Large Language Model)** is a powerful new model for generating code embeddings, designed to capture the deep semantics of source code.
 
@@ -19,7 +27,7 @@ C2LLM: Advanced Code Embeddings for Deep Semantic Understanding
 - **Intelligent Pooling with PMA**: Instead of traditional `mean pooling` or `last token pooling`, C2LLM uses **PMA (Pooling by Multi-head Attention)**. This allows the model to dynamically focus on the most critical parts of the code, creating a more informative and robust embedding.
 - **Trained for Retrieval**: C2LLM was fine-tuned on a dataset of **3 million query-document pairs**, optimizing it for real-world code retrieval and semantic search. It supports Text2Code, Code2Code, and Code2Text tasks.
 
-C2LLM is designed to be a go-to model for tasks like code search and Retrieval-Augmented Generation (RAG).
+C2LLM is designed to be a go-to model for tasks like code search and Retrieval-Augmented Generation (RAG). For more details, please see our [GitHub repository](https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main).
 
 # Model Details
 
@@ -128,6 +136,13 @@ cache = ResultCache("./c2llm_results")
 results = mteb.evaluate(model, tasks=tasks, cache=cache, encode_kwargs={"batch_size": 16})
 ```
 
+## Support Us
+
+If you find this project helpful, please give it a star. It means a lot to us!
+
+[![GitHub stars](https://img.shields.io/github/stars/codefuse-ai/CodeFuse-Embeddings?style=social)](https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main)
+
+
 ## Correspondence to
 
 Jin Qin (qj431428@antgroup.com), Zihan Liao (liaozihan.lzh@antgroup.com), Ziyin Zhang (zhangziying.zzy@antgroup.com), Hang Yu (hyu.hugo@antgroup.com), Peng Di (dipeng.dp@antgroup.com)
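The PMA pooling the README mentions can be illustrated with a minimal, single-head NumPy sketch: a learnable seed vector attends over the token embeddings, so the pooled vector weights tokens by relevance instead of averaging them uniformly as mean pooling does. This is only an illustration of the general technique, not C2LLM's actual implementation; the seed count, dimension, and single-head simplification are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pma_pool(tokens, seed):
    """Pooling by Multi-head Attention, reduced to a single head.

    tokens: (n, d) token embeddings from the encoder
    seed:   (k, d) learnable query seeds (k = number of pooled vectors)
    Returns (k, d): each pooled vector is an attention-weighted
    combination of the token embeddings.
    """
    d = tokens.shape[-1]
    attn = softmax(seed @ tokens.T / np.sqrt(d))  # (k, n), rows sum to 1
    return attn @ tokens                          # (k, d)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))  # 16 tokens, embedding dim 8
seed = rng.normal(size=(1, 8))     # one pooled embedding
pooled = pma_pool(tokens, seed)
print(pooled.shape)                # (1, 8)
```

Because the attention weights are a convex combination, the pooled vector always lies within the per-dimension range of the token embeddings, unlike last-token pooling, which discards all but one position.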
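The Text2Code/Code2Code retrieval use cases described above reduce to nearest-neighbor search over embeddings. Below is a minimal cosine-similarity sketch using stand-in random vectors; a real pipeline would substitute the model's encode output for the stand-ins, and the function name `cosine_retrieve` is hypothetical, not part of any library.

```python
import numpy as np

def cosine_retrieve(query_emb, doc_embs, top_k=3):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q                        # cosine similarity per document
    order = np.argsort(-scores)[:top_k]   # best-first document indices
    return order, scores[order]

# Stand-in embeddings; a real pipeline would embed queries and code
# snippets with the model instead.
rng = np.random.default_rng(1)
docs = rng.normal(size=(5, 8))
query = docs[2] + 0.01 * rng.normal(size=8)  # near-duplicate of doc 2
idx, scores = cosine_retrieve(query, docs, top_k=2)
print(idx[0])  # doc 2 ranks first
```

For RAG, the top-ranked snippets would then be passed to a generator as context; at larger scale the brute-force matrix product is typically replaced by an approximate nearest-neighbor index.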