	/*
	* CuTe DSL decode kernels for Mamba-3 autoregressive generation.
	*
	* Phase 1: Not needed (training only, no generation).
	* Phase 2: Optimized single-token SSM step for inference.
	*
	* Fuses: input_proj + conv_step + ssm_step + output_proj
	* into a single kernel launch for minimal latency.
	*/
	// Stub: the Phase 2 kernel is not implemented yet.
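
	// Illustrative sketch only: an unfused, host-side C++ reference for the four
	// stages listed in the header comment (input_proj, conv_step, ssm_step,
	// output_proj), processing one token. It is NOT the CuTe kernel and NOT the
	// Mamba-3 formulation: it assumes a simplified diagonal recurrence
	// h = a * h + b * x with fixed (not input-dependent) a, b, c. All names,
	// struct layouts, and shapes below are hypothetical and chosen for
	// illustration; such a reference could serve as a correctness oracle when
	// the fused kernel is written.
	#include <cassert>
	#include <cstddef>
	#include <vector>

	// Hypothetical per-sequence state carried between decode steps.
	struct DecodeState {
	    std::vector<float> conv;  // [d_inner * k] rolling window, newest sample last
	    std::vector<float> h;     // [d_inner * n] SSM hidden state
	};

	// Hypothetical weights, all row-major. Requires k >= 1.
	struct DecodeWeights {
	    std::size_t d_model, d_inner, k, n;
	    std::vector<float> w_in;    // [d_inner * d_model] input projection
	    std::vector<float> w_conv;  // [d_inner * k] depthwise conv taps
	    std::vector<float> a;       // [d_inner * n] per-state decay (assumed fixed)
	    std::vector<float> b;       // [n] input coefficients (assumed fixed)
	    std::vector<float> c;       // [n] readout coefficients (assumed fixed)
	    std::vector<float> w_out;   // [d_model * d_inner] output projection
	};

	// Advances the state by one token u (length d_model) and returns the
	// output vector (length d_model).
	std::vector<float> decode_step_reference(const DecodeWeights& w,
	                                         DecodeState& s,
	                                         const std::vector<float>& u) {
	    assert(w.k >= 1 && u.size() == w.d_model);
	    std::vector<float> y(w.d_inner, 0.0f);
	    for (std::size_t ch = 0; ch < w.d_inner; ++ch) {
	        // input_proj: one row of w_in dotted with the token.
	        float x = 0.0f;
	        for (std::size_t j = 0; j < w.d_model; ++j)
	            x += w.w_in[ch * w.d_model + j] * u[j];

	        // conv_step: shift the window left, append x, apply the taps.
	        float* win = &s.conv[ch * w.k];
	        for (std::size_t t = 0; t + 1 < w.k; ++t) win[t] = win[t + 1];
	        win[w.k - 1] = x;
	        float xc = 0.0f;
	        for (std::size_t t = 0; t < w.k; ++t)
	            xc += w.w_conv[ch * w.k + t] * win[t];

	        // ssm_step: update each hidden state in place, then read out.
	        float acc = 0.0f;
	        for (std::size_t m = 0; m < w.n; ++m) {
	            float& h = s.h[ch * w.n + m];
	            h = w.a[ch * w.n + m] * h + w.b[m] * xc;
	            acc += w.c[m] * h;
	        }
	        y[ch] = acc;
	    }

	    // output_proj: map the inner channels back to the model width.
	    std::vector<float> out(w.d_model, 0.0f);
	    for (std::size_t i = 0; i < w.d_model; ++i)
	        for (std::size_t ch = 0; ch < w.d_inner; ++ch)
	            out[i] += w.w_out[i * w.d_inner + ch] * y[ch];
	    return out;
	}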