Commit History
feat: scan layers + gradient checkpointing (#161) 07a6f9a unverified
Merge branch 'main' of https://github.com/borisdayma/dalle-mini into main bcd360f
feat: better multi-node support (#158) 728a3c3 unverified
feat(text): support emojis (#154) 7ef7bd9 unverified
fix: smelu 7f2f8ed
fix: sinkformer 2c583b3
fix: support smelu a2dcee4
feat: allow relative position (#156) 769d20a unverified
feat: sinkhorn in lse mode (#155) 00d4661 unverified
fix: sinkformer gradient eed4896
feat(model): allow bias (#152) 361a994 unverified
feat: add sinkformer + custom final ln + pre-ln (#151) f139b0b unverified
feat: placeholders for more config 69bcbeb
feat: force final ln in encoder 32f4ba5
feat: allow more configurations 5bd4c20
fix: DeepNet doesn't scale weights of embedding/output layers (#150) 503d6b4 unverified
Shuming Ma committed on
feat: remove unecessary LN 02824a7
feat: add cogview 472c4cc
fix(textnormalizer): consider utf8 on windows (#148) 3b8d8cb unverified
illtellyoulater committed on
feat: implement transformer variants (#144) 542378c unverified
feat(data): super conditioning (#141) 7939874 unverified
feat: support pod (#139) 803ccbf unverified
feat: handle gradient checkpointing 5173ec7
feat: load from bucket 1c4e839
feat: reduce artifact space + offset step 34cf91c
feat: restore weights on CPU 5f954fc
fix: position embedding for generate method ebac379
fix: typo 68cc185
fix: load from checkpoint 44b7c3e
feat(modeling): simplify abstract_init fa72aa7
feat(train) - handle multiple nodes (#130) 0952927 unverified
feat: handle model parallel 1bb3269
fix: style 386f839
style(tokenizer): remove unused variables 605df32
feat: use fast tokenizer 767d78a
feat(train): improve pjit speed f254058
fix(train): consider correct batch size b7c7458
feat(train): distributed_shampoo with pjit cc34d07
style: unsused import 7a176b9
feat(model): clean way to load on cpu 12f323d
feat(train): no batch dimension with pjit df1fe19
feat(train): progress on pjit 49597a2
feat: use_artifact if run existing a5ed112
Load from wandb artifact (#121) f69b21b unverified
Style (isort). f9d51f7
Pedro Cuenca committed on
Tokenizer, config, model can be loaded from wandb. 7e48337
Pedro Cuenca committed on
feat(data): support accumulation in non-streaming 88c8e06
feat: custom gradient accumulation 2d07559
Change import order again. 2b2be9b
Pedro Cuenca committed on