feather820 committed · Commit 369cecf · verified · 1 parent: 23fc153

Update README.md

Files changed (1): README.md (+14 -1)
---

# Introduction

<h1 align="center">
  <a href="https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main/">
    <img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" width="30" height="30" style="vertical-align: middle; margin-right: 8px;">
    CodeFuse-Embeddings
  </a>
</h1>

C2LLM: Advanced Code Embeddings for Deep Semantic Understanding

**C2LLM (Code Contrastive Large Language Model)** is a powerful new model for generating code embeddings, designed to capture the deep semantics of source code.
 
- **Intelligent Pooling with PMA**: Instead of traditional `mean pooling` or `last token pooling`, C2LLM uses **PMA (Pooling by Multi-head Attention)**. This lets the model dynamically focus on the most critical parts of the code, producing a more informative and robust embedding (see the sketch below this list).
- **Trained for Retrieval**: C2LLM was fine-tuned on a massive dataset of **3 million query-document pairs**, optimizing it for real-world code retrieval and semantic search. It supports Text2Code, Code2Code, and Code2Text tasks.
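
For intuition, here is a minimal, self-contained PyTorch sketch of PMA-style pooling: a learned seed query attends over all token hidden states, so the pooled embedding can weight informative tokens more heavily. All class names and hyperparameters below are illustrative, not C2LLM's actual implementation.

```python
import torch
import torch.nn as nn

class PMAPooling(nn.Module):
    """Pooling by Multi-head Attention (illustrative sketch, not C2LLM's exact layer).

    A learned seed vector acts as the attention query over all token hidden
    states, replacing mean/last-token pooling with a content-dependent summary.
    """

    def __init__(self, hidden_dim: int, num_heads: int = 8):
        super().__init__()
        # One learned seed vector serves as the attention query.
        self.seed = nn.Parameter(torch.randn(1, hidden_dim))
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, hidden_states: torch.Tensor, padding_mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim); padding_mask: True where padded.
        batch = hidden_states.size(0)
        query = self.seed.unsqueeze(0).expand(batch, -1, -1)  # (batch, 1, hidden_dim)
        pooled, _ = self.attn(query, hidden_states, hidden_states,
                              key_padding_mask=padding_mask)
        return pooled.squeeze(1)  # (batch, hidden_dim) embedding

# Example: pool a batch of 2 sequences of length 16 with 768-dim hidden states.
pma = PMAPooling(hidden_dim=768)
states = torch.randn(2, 16, 768)
mask = torch.zeros(2, 16, dtype=torch.bool)  # no padding in this toy batch
embedding = pma(states, mask)
print(embedding.shape)  # torch.Size([2, 768])
```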

C2LLM is designed to be a go-to model for tasks like code search and Retrieval-Augmented Generation (RAG). For more details, please see our [GitHub repository](https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main).

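As a hypothetical usage example, Text2Code retrieval with cosine similarity might look like the sketch below. The model id, the assumption that the checkpoint loads via `sentence-transformers`, and any prompt formatting are placeholders; see the GitHub repository for the official instructions.

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical checkpoint id; assumes a sentence-transformers-compatible model.
model = SentenceTransformer("codefuse-ai/C2LLM", trust_remote_code=True)

query = "read a JSON file and return it as a dict"  # Text2Code query
candidates = [
    "def load_json(path):\n    import json\n    with open(path) as f:\n        return json.load(f)",
    "def add(a, b):\n    return a + b",
]

# Encode query and candidate code, then rank by cosine similarity.
q_emb = model.encode([query], normalize_embeddings=True)
c_emb = model.encode(candidates, normalize_embeddings=True)
scores = util.cos_sim(q_emb, c_emb)  # shape (1, len(candidates))
best = scores.argmax().item()
print(scores, candidates[best])
```
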
# Model Details

```python
cache = ResultCache("./c2llm_results")
results = mteb.evaluate(model, tasks=tasks, cache=cache, encode_kwargs={"batch_size": 16})
```
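
The snippet above is the tail of an MTEB evaluation; a fuller version might look like the following sketch. The model id and task name are placeholders, and the `ResultCache` import path may differ across `mteb` versions.

```python
import mteb
from mteb import ResultCache  # import path assumed; adjust to your mteb version

# Placeholder identifiers; substitute the released C2LLM checkpoint and your tasks.
model = mteb.get_model("codefuse-ai/C2LLM")
tasks = mteb.get_tasks(tasks=["CodeSearchNetRetrieval"])  # example task name

cache = ResultCache("./c2llm_results")  # cache results on disk between runs
results = mteb.evaluate(model, tasks=tasks, cache=cache, encode_kwargs={"batch_size": 16})
print(results)
```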

## Support Us

If you find this project helpful, please give it a star. It means a lot to us!

[![GitHub stars](https://img.shields.io/github/stars/codefuse-ai/CodeFuse-Embeddings?style=social)](https://github.com/codefuse-ai/CodeFuse-Embeddings/tree/main)

## Correspondence to

Jin Qin (qj431428@antgroup.com), Zihan Liao (liaozihan.lzh@antgroup.com), Ziyin Zhang (zhangziying.zzy@antgroup.com), Hang Yu (hyu.hugo@antgroup.com), Peng Di (dipeng.dp@antgroup.com)