deeplm-108m / deeplm

Commit History

fix: remove non-existent add_entry call in _commit_changes
fdf7357

Luvion1 committed on

fix: _metrics -> metrics in _commit_changes
148593d

Luvion1 committed on

fix: batch=2 accum=1 for T4 VRAM
fcd3691

Luvion1 committed on

fix: hash only model.safetensors+config.json for upload skip detection (was hashing entire tmpdir incl checkpoints/charts)
c4fba4b

Luvion1 committed on
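
The skip detection described in c4fba4b can be sketched roughly as follows. This is a minimal illustration, not the repository's code: only the two file names come from the commit message, while `upload_fingerprint`, `should_upload`, and the idea of comparing against a stored digest are assumptions.

```python
import hashlib
from pathlib import Path

# File names taken from the commit message; everything else is hypothetical.
TRACKED_FILES = ("model.safetensors", "config.json")


def upload_fingerprint(export_dir, names=TRACKED_FILES):
    """Return a SHA-256 hex digest computed over the named files only."""
    digest = hashlib.sha256()
    for name in sorted(names):
        path = Path(export_dir) / name
        # Mix in the file name so swapping contents between files changes the digest.
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        if path.exists():
            with path.open("rb") as fh:
                # Read in 1 MiB chunks to avoid loading large weights into memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
        digest.update(b"\0")
    return digest.hexdigest()


def should_upload(export_dir, last_fingerprint):
    """True when the tracked files differ from the previously uploaded state."""
    return upload_fingerprint(export_dir) != last_fingerprint
```

Because checkpoints and charts in the same directory never enter the digest, regenerating them does not trigger a new upload under this scheme.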

maintenance: atomic checkpoints, error logging, dead code removal, config alignment, KV cache cap, robustness fixes
1d7b0ca

Luvion1 committed on
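
The "atomic checkpoints" item in 1d7b0ca most likely refers to the usual write-then-rename pattern. The sketch below shows that pattern in general terms; `atomic_write_bytes` is a hypothetical helper, and the commit may implement it differently.

```python
import os
import tempfile


def atomic_write_bytes(path, payload):
    """Write payload to path so readers never observe a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # replaces any existing file in one step
    except BaseException:
        # On failure, remove the temporary file and leave the old checkpoint intact.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is interrupted mid-write, the previous checkpoint at `path` is still complete, which is the property that matters when resuming training.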

fix: compiled model truthiness in evolution bug_diagnosis
393025d

Luvion1 committed on

fix: compiled model truthiness check (is not None, not __len__)
e1fdac6

Luvion1 committed on
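
The truthiness problem named in e1fdac6 can be reproduced in plain Python. When a class defines `__len__` but not `__bool__`, `if obj:` calls `__len__`, so an object that exists can still evaluate as false. `WrappedModel` and `pick_model` below are hypothetical stand-ins, not the project's classes.

```python
class WrappedModel:
    """Stand-in for a wrapper whose __len__ can legitimately return 0."""

    def __init__(self, layers):
        self.layers = layers

    def __len__(self):
        return len(self.layers)


def pick_model(compiled, fallback):
    # `compiled or fallback` would consult __len__ and discard an empty wrapper;
    # an identity check against None only asks whether the object exists.
    return compiled if compiled is not None else fallback
```

With `m = WrappedModel([])`, the expression `m or fallback` yields `fallback`, while `pick_model(m, fallback)` returns `m`.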

fix: batch=20 accum=2 to fit T4 VRAM with torch.compile
97f520b

Luvion1 committed on
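
Assuming "batch" in these messages is the per-step micro-batch and "accum" is the number of gradient accumulation steps (the log does not define either), the effective batch size per optimizer step is their product. The short calculation below uses the values from 27d6040, 97f520b, and fcd3691.

```python
def effective_batch_size(micro_batch, accum_steps):
    """Samples contributing to one optimizer step under gradient accumulation."""
    return micro_batch * accum_steps


# (batch, accum) pairs as written in the commit messages.
settings = {"27d6040": (40, 1), "97f520b": (20, 2), "fcd3691": (2, 1)}
for commit, (batch, accum) in settings.items():
    print(commit, effective_batch_size(batch, accum))
```

Under that reading, 97f520b halves the per-step memory footprint while keeping the effective batch at 40, whereas fcd3691 reduces the effective batch to 2.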

fix: torch.compile force_parameter_static_shapes=False for bitnet weight shapes
8419f03

Luvion1 committed on

fix: torch.compile backend=eager (inductor incompatible with buffer mutations)
fa6ad4e

Luvion1 committed on

fix: pass num_heads to MLAConfig; always update flat_grad on shape mismatch
605fddd

Luvion1 committed on

feat: torch.compile, batch=40 accum=1
27d6040

Luvion1 committed on

fix: skip empty HF commits via content hash check
deaa5df

Luvion1 committed on

fix: use pre-clip grad norm; faster cos_sim EMA (0.3), skip stale flat_grad on shape mismatch
7d02df8

Luvion1 committed on

fix: recursive nested dataclass update in config loading
ba0aa86

Luvion1 committed on
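
A recursive update of nested dataclasses, as named in ba0aa86, might look like the sketch below. `update_dataclass`, `AttentionConfig`, and `ModelConfig` are hypothetical names chosen for illustration; the field names `num_heads` and `rope_theta` are borrowed from other commit messages in this log.

```python
from dataclasses import dataclass, field, fields, is_dataclass


def update_dataclass(obj, overrides):
    """Recursively apply a nested dict of overrides onto a dataclass instance."""
    known = {f.name for f in fields(obj)}
    for key, value in overrides.items():
        if key not in known:
            raise KeyError(f"unknown config field: {key}")
        current = getattr(obj, key)
        if is_dataclass(current) and not isinstance(current, type) and isinstance(value, dict):
            # Descend instead of replacing the nested dataclass with a plain dict.
            update_dataclass(current, value)
        else:
            setattr(obj, key, value)
    return obj


@dataclass
class AttentionConfig:  # hypothetical example config
    num_heads: int = 8
    rope_theta: float = 10000.0


@dataclass
class ModelConfig:  # hypothetical example config
    hidden_size: int = 512
    attention: AttentionConfig = field(default_factory=AttentionConfig)
```

Calling `update_dataclass(ModelConfig(), {"attention": {"num_heads": 4}})` changes only `num_heads`; `rope_theta` keeps its default and `attention` remains an `AttentionConfig` rather than becoming a dict.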

fix: HybridAttention missing rope_theta/max_seq_len attributes
9895430

Luvion1 committed on

bug fixes & improvements: chart string-safe, HF/kaggle_train split, autonomous module, curriculum router guards
d33a3e8

Luvion1 committed on

fix: include kaggle_train.py in dataset sync
71502d8

Luvion1 committed on

fix: missing _analyze_logs method
2698eb2

Luvion1 committed on

fix: guard curriculum_router missing methods
8fbf626

Luvion1 committed on

fix: HF serialization with _clean_for_json
0eefcdd

Luvion1 committed on

fix: total_tokens resume offset
9523c8c

Luvion1 committed on

full self-evolution: trainer integration, real changes, auto episodes
cdec3b1

Luvion1 committed on

update deeplm/ code, remove data/
80dc120

Luvion1 committed on

Initial upload: source + config + tokenizer + charts + metrics
321bd39
verified

samcheng0 committed on

update deeplm/ source code (full: model, training, quantization, self_evolution, data, inference)
54f655b
verified

samcheng0 committed on

Delete deeplm/scripts/train_bitnet.py with huggingface_hub
d40df2f
verified

samcheng0 committed on

Delete deeplm/scripts/save_model.py with huggingface_hub
12fab3f
verified

samcheng0 committed on

Delete deeplm/deeplm/training/trainer.py with huggingface_hub
30c72b3
verified

samcheng0 committed on

Delete deeplm/deeplm/training/logger.py with huggingface_hub
8814b7e
verified

samcheng0 committed on

Delete deeplm/deeplm/training/data_pipeline.py with huggingface_hub
731349f
verified

samcheng0 committed on

Delete deeplm/deeplm/training/curriculum_router.py with huggingface_hub
f64570a
verified

samcheng0 committed on

Delete deeplm/deeplm/training/control/training_control.py with huggingface_hub
cd717b9
verified

samcheng0 committed on

Delete deeplm/deeplm/training/control/__init__.py with huggingface_hub
d271c4b
verified

samcheng0 committed on

Delete deeplm/deeplm/training/auto_tuner.py with huggingface_hub
4589d53
verified

samcheng0 committed on

Delete deeplm/deeplm/training/__init__.py with huggingface_hub
4b37dbd
verified

samcheng0 committed on

Delete deeplm/deeplm/self_evolution/framework.py with huggingface_hub
d5e8355
verified

samcheng0 committed on

Delete deeplm/deeplm/self_evolution/__init__.py with huggingface_hub
a7fe459
verified

samcheng0 committed on

Delete deeplm/deeplm/quantization/gguf_export.py with huggingface_hub
8b21a89
verified

samcheng0 committed on

Delete deeplm/deeplm/quantization/bitnet_quantize.py with huggingface_hub
8f88a8a
verified

samcheng0 committed on

Delete deeplm/deeplm/quantization/__init__.py with huggingface_hub
88578ac
verified

samcheng0 committed on

Delete deeplm/deeplm/model/transformer_block.py with huggingface_hub
723214e
verified

samcheng0 committed on

Delete deeplm/deeplm/model/mtp.py with huggingface_hub
2ca43f3
verified

samcheng0 committed on

Delete deeplm/deeplm/model/moe.py with huggingface_hub
686d82c
verified

samcheng0 committed on

Delete deeplm/deeplm/model/mla.py with huggingface_hub
b19b8e2
verified

samcheng0 committed on

Delete deeplm/deeplm/model/hyper_connections.py with huggingface_hub
62119a2
verified

samcheng0 committed on

Delete deeplm/deeplm/model/hybrid_attention.py with huggingface_hub
eb9458a
verified

samcheng0 committed on

Delete deeplm/deeplm/model/deeplm.py with huggingface_hub
b7e42c0
verified

samcheng0 committed on

Delete deeplm/deeplm/model/__init__.py with huggingface_hub
3960992
verified

samcheng0 committed on

Delete deeplm/deeplm/inference/generate.py with huggingface_hub
9e734cb
verified

samcheng0 committed on