syx committed
Commit · 5bf22c9
1 Parent(s): 9260b43
minor
README.md CHANGED
@@ -7,7 +7,7 @@ Qwen2-7B-ReLU is a variant of Qwen2-7B that replaces the SiLU/Swish activation f
 
 ## Key Features
 
-- Replaces SiLU/Swish activation function with
+- Replaces SiLU/Swish activation function with dReLU
 - Maintains comparable or even better performance with the original Qwen2-7B
 - Significantly increases activation sparsity, enabling further optimization and compression
 I'll add this implementation detail to the README under a new "Technical Details" section, as this is an important architectural change that researchers should be aware of: