AbstractPhil committed on
Commit 91eb06a · verified · 1 Parent(s): eddd60f

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -51,7 +51,8 @@ former = svd_transformer(
  # "sigmoid" = Sigmoid, 1 / (1 + exp(-x)), can be effective for certain tasks as it allows for values between 0 and 1 and can capture more complex relationships in the data.
  # "leaky_relu" = Leaky ReLU, max(0.01 * x, x), can be effective for certain tasks as it allows for a small gradient when the input is negative, which can help prevent dead neurons in the network.
  # "swilu" = Sigmoid Weighted Linear Unit, x * sigmoid(x), similar to silu but with a slightly different formulation, can also be effective for certain tasks.
- token_out="QKV", # the format of token expected out
+ token_out="all", # the format of token expected out
+ # "all" or None will return all tokens, which applies transformer logic automatically.
  # "QKV" standard attention token, applies transformer logic internally and can accept rotary behavior
  # "SUVt" or "SUV" geometric tokens returned only, QKV transformation learning not applied.
  target="SVD", # "SVD" targets all 3, good for complex tasks.
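
The three activation formulas quoted in the README comments of this hunk can be checked with a minimal sketch in plain Python. The function names below are illustrative only and are not taken from the `svd_transformer` code; only the formulas come from the comments. Note that, as written, the `"swilu"` formula `x * sigmoid(x)` coincides with the usual SiLU definition.

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid as given in the README comment: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Leaky ReLU as given in the README comment: max(0.01 * x, x)."""
    return max(slope * x, x)


def swilu(x: float) -> float:
    """Formula quoted for "swilu": x * sigmoid(x)."""
    return x * sigmoid(x)


if __name__ == "__main__":
    # Print each activation at a few sample inputs.
    for value in (-2.0, 0.0, 3.0):
        print(value, sigmoid(value), leaky_relu(value), swilu(value))
```

Negative inputs keep a small nonzero output under `leaky_relu`, which is the "small gradient" behavior the comment describes, while `sigmoid` stays strictly between 0 and 1.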