qtris123's Collections
qasper-sparse-finetuning
updated
12 days ago
Naming defaults: if a model name does not mention bsize, bsize = 32; if it does not mention num-tokens, num-tokens = 1024.
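The naming rule above can be applied mechanically when comparing runs. The following is a minimal sketch, assuming the repository names in this collection follow the pattern visible in the list (a scope of `per-layer`, `per-head`, or `global`; an optional `initial` marker; an optional `top-<k>`; an optional `key-value` or `value-only` tag; optional `bsize-<n>` and `num-tokens-<n>` suffixes). The `RunConfig` class, its field names, and `parse_repo_name` are hypothetical helpers introduced for illustration, and the field names are only labels for the name fragments, not a statement of what each setting does in training.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Defaults stated in the collection note.
DEFAULT_BSIZE = 32
DEFAULT_NUM_TOKENS = 1024


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the fragments encoded in a repo name."""
    scope: str                # "per-layer", "per-head", or "global"
    initial: bool             # True if the name contains "initial"
    top_k: Optional[int]      # value after "top-", if present
    target: Optional[str]     # "key-value" or "value-only", if present
    bsize: int
    num_tokens: int


def parse_repo_name(repo_id: str) -> RunConfig:
    """Parse a repo id such as 'qtris123/qasper-per-head-top-64-...'."""
    # Drop the owner prefix; a few names use "_" where others use "-".
    name = repo_id.split("/")[-1].replace("_", "-")

    scope = re.search(r"(per-layer|per-head|global)", name)
    if scope is None:
        raise ValueError(f"no scope found in repo name: {repo_id!r}")

    top_k = re.search(r"top-(\d+)", name)
    target = re.search(r"(key-value|value-only)", name)
    bsize = re.search(r"bsize-(\d+)", name)
    num_tokens = re.search(r"num-tokens-(\d+)", name)

    return RunConfig(
        scope=scope.group(1),
        initial="-initial-" in name,
        top_k=int(top_k.group(1)) if top_k else None,
        target=target.group(1) if target else None,
        # Fall back to the documented defaults when a suffix is absent.
        bsize=int(bsize.group(1)) if bsize else DEFAULT_BSIZE,
        num_tokens=int(num_tokens.group(1)) if num_tokens else DEFAULT_NUM_TOKENS,
    )
```

For example, `parse_repo_name("qtris123/qasper-per-layer-top-64-value-only-all-reduce-bsize-64")` yields `bsize=64` and the default `num_tokens=1024`, while a name ending in `num-tokens-512` yields the default `bsize=32`.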
qtris123/qasper-initial-per-layer-all-reduce
Updated
17 days ago
qtris123/qasper-initial-per-head-all-reduce
Updated
17 days ago
qtris123/qasper-per-layer-top-64-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-layer-top-128-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-layer-top-256-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-layer-top-512-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-head-top-64-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-head-top-128-key-value_all-reduce
Updated
16 days ago
qtris123/qasper-per-head-top-512-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-head-top-256-key-value-all-reduce
Updated
16 days ago
qtris123/qasper-per-layer-top-64-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-layer-top-128-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-layer-top-256-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-layer-top-512-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-head-top-64-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-head-top-128-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-head-top-256-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-head-top-512-value-only-all-reduce
Updated
15 days ago
qtris123/qasper-per-head-top-64-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-head-top-128-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-head-top-256-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-head-top-512-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-layer-top-64-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-layer-top-128-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-layer-top-256-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-layer-top-512-value-only-all-reduce-bsize-64
Updated
15 days ago
qtris123/qasper-per-head-top-32-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-head-top-64-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-head-top-128-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-head-top-256-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-layer-top-32-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-layer-top-64-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-layer-top-128-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-per-layer-top-256-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-initial-per-layer-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-initial-per-head-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-initial-global-all-reduce_num-tokens-512
Updated
12 days ago
qtris123/qasper-global-top-32-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-global-top-64-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-global-top-128-key-value-all-reduce-num-tokens-512
Updated
12 days ago
qtris123/qasper-global-top-256-key-value-all-reduce-num-tokens-512
Updated
12 days ago