Lanni-ni/stickbreaking_babylm_100m_2layer
Tags: Text Generation, Transformers, Safetensors, stickbreaking, custom_code, arxiv:1910.09700
Branch: main
Path: stickbreaking_babylm_100m_2layer/ops
Total size: 261 kB
Contributors: 1
History: 1 commit
Latest commit: ecbbe65 (verified) by Lanni-ni, "add remote code + model files", about 1 month ago
Name | Size | Last commit | Updated
.ipynb_checkpoints | - | add remote code + model files | about 1 month ago
__pycache__ | - | add remote code + model files | about 1 month ago
geometric_attention | - | add remote code + model files | about 1 month ago
__init__.py | 69 Bytes | add remote code + model files | about 1 month ago
direction_sensitive_geometric.py | 5.97 kB | add remote code + model files | about 1 month ago
direction_sensitive_geometric.py.bak | 5.9 kB | add remote code + model files | about 1 month ago
forgetting_attention.py | 47.3 kB | add remote code + model files | about 1 month ago
forgetting_attention_std.py | 2.19 kB | add remote code + model files | about 1 month ago
framework_mock.py | 520 Bytes | add remote code + model files | about 1 month ago
geometric_attention_final.py | 2.82 kB | add remote code + model files | about 1 month ago
geometric_attention_std.py | 5.8 kB | add remote code + model files | about 1 month ago
layer_with_visualization.py | 1.26 kB | add remote code + model files | about 1 month ago
multi_head_attention.py | 7.09 kB | add remote code + model files | about 1 month ago
multi_head_relative_pos_attention.py | 9.64 kB | add remote code + model files | about 1 month ago
multi_head_relative_pos_attention.py.bak | 9.56 kB | add remote code + model files | about 1 month ago
sliding_window_attention_std.py | 2.39 kB | add remote code + model files | about 1 month ago
stickbreaking_attention_std.py | 1.11 kB | add remote code + model files | about 1 month ago
transformer.py | 7.3 kB | add remote code + model files | about 1 month ago
vanilla_attention_std.py | 5.64 kB | add remote code + model files | about 1 month ago