fuvty committed
Commit b988ebf · verified · Parent(s): 48104f3

Update README.md

Files changed (1): README.md (+38 −6)
README.md CHANGED
@@ -1,13 +1,45 @@
 ---
-title: C2C Demo
-emoji: 📈
-colorFrom: pink
-colorTo: red
+title: Cache-to-Cache Communication Demo
+emoji: 🔗
+colorFrom: blue
+colorTo: blue
 sdk: gradio
-sdk_version: 5.49.1
+sdk_version: 5.9.1
 app_file: app.py
 pinned: false
 license: apache-2.0
+tags:
+  - llm
+  - cache-to-cache
+  - model-communication
+  - kv-cache
+short_description: Compare Single, Text-to-Text, and Cache-to-Cache inference
+thumbnail: >-
+  https://cdn-uploads.huggingface.co/production/uploads/6445fd9ba56444c355dcbcba/R5YOyw0aoBENYJs8Ugnbi.png
 ---
+# Cache-to-Cache Communication Demo
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+This Space demonstrates **Cache-to-Cache (C2C)** communication between Large Language Models, comparing three inference approaches side-by-side:
+
+1. **Single Model**: Standard inference with one model
+2. **Text-to-Text (T2T)**: Two-stage communication where the Sharer model generates text and the Receiver model processes it
+3. **Cache-to-Cache (C2C)**: Direct KV-Cache communication between Sharer and Receiver
+
+## What is Cache-to-Cache?
+
+It makes language models talk without words.
+
+Cache-to-Cache (C2C) lets multiple LLMs communicate directly through their KV-caches instead of text, transferring deep semantics without token-by-token generation.
+
+The payoff: up to 10% higher accuracy, 3–5% gains over text-based communication, and 2× faster responses.
+
+## Citation
+
+```bibtex
+@article{fu2025c2c,
+  title={Cache-to-Cache: Direct Semantic Communication Between Large Language Models},
+  author={Tianyu Fu and Zihan Min and Hanling Zhang and Jichao Yan and Guohao Dai and Wanli Ouyang and Yu Wang},
+  journal={arXiv preprint arXiv:2510.03215},
+  year={2025},
+}
+```
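As a rough illustration of the C2C idea described in the added README, here is a toy NumPy sketch of fusing two per-layer KV-caches. This is *not* the authors' implementation: the paper trains a neural projector to map the Sharer's cache into the Receiver's representation space, and the fixed convex blend (`alpha`) below merely stands in for that learned mapping. The shapes and function names are illustrative assumptions.

```python
import numpy as np

# Toy KV-cache: one (keys, values) pair per layer,
# each array of shape (heads, seq_len, head_dim).
def make_cache(seq_len, layers=2, heads=2, head_dim=4, seed=0):
    rng = np.random.default_rng(seed)
    return [
        (rng.standard_normal((heads, seq_len, head_dim)),
         rng.standard_normal((heads, seq_len, head_dim)))
        for _ in range(layers)
    ]

def fuse_caches(receiver_cache, sharer_cache, alpha=0.7):
    """C2C-style fusion sketch: blend the Sharer's cache into the
    Receiver's, layer by layer. The real method learns a projector
    between the two models' cache spaces; a fixed blend stands in
    for it here."""
    fused = []
    for (rk, rv), (sk, sv) in zip(receiver_cache, sharer_cache):
        fused.append((alpha * rk + (1 - alpha) * sk,
                      alpha * rv + (1 - alpha) * sv))
    return fused

receiver = make_cache(seq_len=5, seed=0)
sharer = make_cache(seq_len=5, seed=1)
fused = fuse_caches(receiver, sharer)

# Shapes are unchanged, so the Receiver could keep decoding from the
# fused cache exactly as from its own -- no Sharer tokens are ever
# generated, which is where the latency win comes from.
print(len(fused), fused[0][0].shape)
```

The key property the sketch shows is that fusion happens entirely in cache space: the Sharer contributes semantics without the token-by-token generation step that T2T communication requires.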