throbbey committed on
Commit 1e35a07 · verified · 1 Parent(s): bd7abe2

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -46,7 +46,7 @@ those of standard transformers.
 The original CRATE paper (NeurIPS 2023) used ISTA-style **soft-thresholding**
 as the sparse activation: `S_lambda(x) = sign(x) * max(|x| - lambda, 0)`.
 This is the theoretically "correct" proximal operator for L1-regularized sparse
-coding, but it caused training instability at scale.
+coding, but it caused training instability at scale. The git repo has options to use either.
 
 CRATE-α (NeurIPS 2024) introduced three modifications that enable scaling:
 
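
For readers of this diff, the soft-thresholding formula quoted in the README text, `S_lambda(x) = sign(x) * max(|x| - lambda, 0)`, can be sketched in plain Python as follows. This is only an illustration of the formula, not code from the repository: the names `soft_threshold` and `soft_threshold_all` are hypothetical, and the repository's actual implementation and its option for choosing the activation are not shown in this commit.

```python
import math


def soft_threshold(x: float, lam: float) -> float:
    """Illustrative S_lambda(x) = sign(x) * max(|x| - lambda, 0).

    Hypothetical helper, not taken from the CRATE repository.
    """
    # Shrink the magnitude by lam, clamping at zero.
    magnitude = max(abs(x) - lam, 0.0)
    # Restore the sign of the input.
    return math.copysign(magnitude, x)


def soft_threshold_all(xs, lam):
    """Apply soft-thresholding elementwise to a list of floats."""
    return [soft_threshold(x, lam) for x in xs]
```

Values whose magnitude is at most `lam` are mapped to zero, and larger values are shrunk toward zero by `lam`, which is what makes the output sparse.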