Commit History

Upload model.safetensors with huggingface_hub
3daf17a
verified

ishanjmukherjee committed on

Clean up gitignore for HF push
b6709d6

ishanjmukherjee committed on

Show subtractions of HF and checkpoint key sets in inspection script
c27420e

ishanjmukherjee committed on

Fix PT to safetensors conversion
4b35203

ishanjmukherjee committed on

Add Evo 2 checkpoint download script to gitignore
31b7640

ishanjmukherjee committed on

Downgrade max sample length in custom config to avoid CUDA OOM error
c8e5dcb

ishanjmukherjee committed on

Remove more custom device management
67959c9

ishanjmukherjee committed on

Remove more manual device management logic from model.py
27b8600

ishanjmukherjee committed on

Replace TELinear with nn.Linear
420e913

ishanjmukherjee committed on

Add script to compare keys of HF model and checkpoint
266f58d

ishanjmukherjee committed on

Disable Flash Attention (for now)
3d180ae

ishanjmukherjee committed on

Fix tokenization decoding error
8609610

ishanjmukherjee committed on

Remove device management from StripedHyena forward pass for x
14f37f0

ishanjmukherjee committed on

Change past_key_values indexing from hyena (inherited from Together's Evo 1 HF code) to hcl, hcm and hcs
00d76a9

ishanjmukherjee committed on

Add use_cache argument in config definition
4caf2e9

ishanjmukherjee committed on

Change generate() arguments to the format expected by Evo 2
9daa8fb

ishanjmukherjee committed on

Fix relative import of attention
8ca747b

ishanjmukherjee committed on

Custom-written HF glue; config JSON is translated from Evo 2 YAML and Python is based on Together's HF port code
c999bb6

ishanjmukherjee committed on

Copy rotary from vortex; drop-in replace vortex.ops apply_rotary with flash_attn's apply_rotary
305b72a

ishanjmukherjee committed on

Copy Python verbatim from vortex
43539ed

ishanjmukherjee committed on

Add tokenizer files verbatim from Together's Evo 1 HF
db299d8

ishanjmukherjee committed on