Pass dtype through to Gemma init for proper torch_dtype support (b81a726, verified; LeoChen085 committed 2 days ago)
Init Gemma in float32, users control dtype via torch_dtype param (117e0c8, verified)
Uniform float32 weights, clean dtype handling: modeling_slip.py (1cc2650, verified)
Uniform float32 weights, clean dtype handling: config.json (3e8529f, verified)
Use set_default_dtype for uniform dtype init, assert batch size match in CrossAttention (28848e5, verified)
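The two commits above describe the HF-style dtype contract: weights are initialized in float32, and the caller opts into a lower precision via a `torch_dtype` parameter. A minimal sketch of that semantics, using a stand-in `nn.Linear` rather than the actual SLIP/Gemma code (the function name is hypothetical):

```python
import torch
import torch.nn as nn

def load_with_torch_dtype(torch_dtype=None):
    """Mimic the HF convention: build in float32, cast only if asked.

    `nn.Linear` stands in for the real Gemma backbone here.
    """
    model = nn.Linear(8, 8)          # default init dtype is float32
    if torch_dtype is not None:
        model = model.to(torch_dtype)  # user-controlled downcast, e.g. bfloat16
    return model

# Default: float32 weights; explicit torch_dtype downcasts the whole module.
assert load_with_torch_dtype().weight.dtype == torch.float32
assert load_with_torch_dtype(torch.bfloat16).weight.dtype == torch.bfloat16
```

This mirrors why the later commits remove the hardcoded bfloat16: the dtype decision moves from the model code to the caller.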
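The `set_default_dtype` technique named in the commit above can be sketched as follows; the submodules are stand-ins, not the repo's actual architecture, and restoring the previous default in a `finally` block is an assumption about good hygiene rather than something the commit message states:

```python
import torch
import torch.nn as nn

def build_uniform(dtype: torch.dtype) -> nn.Module:
    """Construct every submodule under one default float dtype."""
    prev = torch.get_default_dtype()
    torch.set_default_dtype(dtype)   # newly created float params use `dtype`
    try:
        # stand-in for the vision tower + language model construction
        model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8))
    finally:
        torch.set_default_dtype(prev)  # don't leak the default to callers
    return model

model = build_uniform(torch.float32)
assert all(p.dtype == torch.float32 for p in model.parameters())
```

The advantage over a post-hoc `.to(dtype)` cast is that every parameter and buffer is born in the target dtype, so no submodule is ever in a mismatched state during init.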
Update configuration_slip.py: use HF torch_dtype instead of hardcoded bfloat16 (42f77de, verified)
Update modeling_slip.py: use HF torch_dtype instead of hardcoded bfloat16 (866f1ac, verified)
Update config.json: use HF torch_dtype instead of hardcoded bfloat16 (822fff3, verified)
Fix CrossAttention batch size mismatch: expand context to match query batch (08e9734, verified)
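The batch-size fix above ("expand context to match query batch") amounts to broadcasting a singleton context batch across the query batch before attention. A minimal sketch under that assumption (the function name and shapes are illustrative, not SLIP's actual code):

```python
import torch

def match_context_batch(context: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """Expand a shared (batch-1) context so its batch dim matches the query's."""
    b_q, b_c = query.size(0), context.size(0)
    if b_c == b_q:
        return context
    # only a singleton context batch can be safely broadcast to the query batch
    assert b_c == 1, f"batch mismatch: context {b_c} vs query {b_q}"
    return context.expand(b_q, *context.shape[1:])  # a view, no data copy

query = torch.randn(4, 16, 32)    # (batch, query_len, dim)
context = torch.randn(1, 77, 32)  # shared conditioning, batch 1
assert match_context_batch(context, query).shape == (4, 77, 32)
```

Using `expand` rather than `repeat` keeps this zero-copy: the expanded context is a view with a stride-0 batch dimension, which is all downstream matmuls in cross-attention need.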
Fix dtype mismatches: cast entire model to bfloat16 in init (bcd17bc, verified)