ngocbh commited on
Commit
3c82660
·
verified ·
1 Parent(s): 3e611af

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +7 -7
README.md CHANGED
@@ -116,12 +116,12 @@ For a runnable end-to-end example, see [`examples/test_qwen3.py`](examples/test_
116
 
117
  | Base Model | TRIM-KV Checkpoints | Training Datasets | Training Context Len | Training $M$ |
118
  |------------------------------|-----------------------------------------------|--------------------------|-------------------------|--------------|
119
- | Qwen3-1.7B | [TRIM-KV-Qwen3-1.7B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-1.7B-Math) | OpenR1-Math-220k | 16K | 512 |
120
- | Qwen3-4B | [TRIM-KV-Qwen3-4B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-4B-Math) | OpenR1-Math-220k | 16K | 512 |
121
- | Qwen3-8B | [TRIM-KV-Qwen3-8B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-8B-Math) | OpenR1-Math-220k | 16K | 512 |
122
- | Qwen3-14B | [TRIM-KV-Qwen3-14B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-14B-Math) | OpenR1-Math-220k | 16K | 512 |
123
- | Qwen3-4B-Instruct-2507 | [TrimKV-Qwen3-4B-Instruct-2507](https://huggingface.co/ngocbh/TrimKV-Qwen3-4B-Instruct-2507) | Synth-Long, BookSum, Buddhi | 128K | 4096 |
124
- | Phi-3-mini-128k-instruct | [TrimKV-Phi-3-mini-128k-instruct](https://huggingface.co/ngocbh/TrimKV-Phi-3-mini-128k-instruct) | LongAlpaca | 128K | 2048 |
125
- | DeepSeek-R1-Distill-Llama-8B | [TrimKV-DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/ngocbh/TrimKV-DeepSeek-R1-Distill-Llama-8B) | OpenR1-Math-220k | 32K | 512 |
126
 
127
  ---
 
116
 
117
  | Base Model | TRIM-KV Checkpoints | Training Datasets | Training Context Len | Training $M$ |
118
  |------------------------------|-----------------------------------------------|--------------------------|-------------------------|--------------|
119
+ | Qwen3-1.7B | [TRIM-KV-Qwen3-1.7B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-1.7B-Math) | OpenR1-Math-220k | 16K | 256 |
120
+ | Qwen3-4B | [TRIM-KV-Qwen3-4B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-4B-Math) | OpenR1-Math-220k | 16K | 256 |
121
+ | Qwen3-8B | [TRIM-KV-Qwen3-8B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-8B-Math) | OpenR1-Math-220k | 16K | 256 |
122
+ | Qwen3-14B | [TRIM-KV-Qwen3-14B-Math](https://huggingface.co/ngocbh/TrimKV-Qwen3-14B-Math) | OpenR1-Math-220k | 16K | 256 |
123
+ | Qwen3-4B-Instruct-2507 | [TrimKV-Qwen3-4B-Instruct-2507](https://huggingface.co/ngocbh/TrimKV-Qwen3-4B-Instruct-2507) | Synth-Long, BookSum, Buddhi | 128K | 1024 |
124
+ | Phi-3-mini-128k-instruct | [TrimKV-Phi-3-mini-128k-instruct](https://huggingface.co/ngocbh/TrimKV-Phi-3-mini-128k-instruct) | LongAlpaca | 128K | 512 |
125
+ | DeepSeek-R1-Distill-Llama-8B | [TrimKV-DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/ngocbh/TrimKV-DeepSeek-R1-Distill-Llama-8B) | OpenR1-Math-220k | 32K | 256 |
126
 
127
  ---