hyper-accel/ci-2layer-llama2-7b


ci-2layer-llama2-7b

1.34 GB

1 contributor

History: 5 commits

Latest commit by ELutris: V2: continued KD fine-tune at seq_len 1024, 500 steps, lr 1e-4 from V1 (alpaca-cleaned)

683b9c3 (verified), about 2 months ago

File | Size | Last commit | Updated
.gitattributes | 1.52 kB | initial commit | about 2 months ago
config.json | 725 Bytes | KD-distilled 2-layer student against Llama-2-7B teacher (alpaca-cleaned, 1500 steps, T=2.0, KL loss) | about 2 months ago
generation_config.json | 194 Bytes | KD-distilled 2-layer student against Llama-2-7B teacher (alpaca-cleaned, 1500 steps, T=2.0, KL loss) | about 2 months ago
model.safetensors | 1.33 GB (xet) | V2: continued KD fine-tune at seq_len 1024, 500 steps, lr 1e-4 from V1 (alpaca-cleaned) | about 2 months ago
special_tokens_map.json | 548 Bytes | Add first 2-layer slice of NousResearch/Llama-2-7b-hf | about 2 months ago
tokenizer.json | 3.62 MB | KD-distilled 2-layer student against Llama-2-7B teacher (alpaca-cleaned, 1500 steps, T=2.0, KL loss) | about 2 months ago
tokenizer.model | 500 kB (xet) | Add first 2-layer slice of NousResearch/Llama-2-7b-hf | about 2 months ago
tokenizer_config.json | 370 Bytes | KD-distilled 2-layer student against Llama-2-7B teacher (alpaca-cleaned, 1500 steps, T=2.0, KL loss) | about 2 months ago
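
The commit messages above mention a knowledge-distillation objective with "T=2.0, KL loss" but do not include the training code. As a rough illustration only, the sketch below computes a temperature-softened KL divergence between teacher and student logits for a single token position, using only the Python standard library. The function names `softmax` and `kd_kl_loss`, the direction KL(teacher || student), and the T² scaling factor are assumptions taken from the common distillation recipe, not details confirmed by this repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_kl_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the result is multiplied by T**2, a common convention that
    keeps gradient magnitudes comparable across temperatures. The commit
    messages do not say whether this repository's training did the same.
    """
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    print(kd_kl_loss(teacher, teacher))          # identical logits give zero loss
    print(kd_kl_loss([0.0, 0.0, 0.0], teacher))  # a uniform student is penalized
```

In an actual training run this quantity would typically be averaged over all token positions in a batch, with logits coming from the Llama-2-7B teacher and the 2-layer student on the same input.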