added this

LFM2.5-1.2B-Thinking excels at long-context inference. For example, on AMD Ryzen™ NPUs with FastFlowLM, decoding throughput sustains ~52 tok/s at 16K context and ~46 tok/s even at the full 32K context, indicating robust long-context scalability. For more details on longer-context benchmarks with FastFlowLM on AMD Ryzen™ NPUs, see here.
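
For readers who want to sanity-check sustained decode throughput themselves, a minimal sketch is shown below. It assumes the model is served behind an OpenAI-compatible streaming endpoint; the URL, port, and model id are placeholders, not documented FastFlowLM settings, and the chunk-per-token count is an approximation.

```python
# Minimal decode-throughput probe against an OpenAI-compatible streaming server.
# The endpoint URL, port, and model id below are assumptions/placeholders.
import time
import requests

BASE_URL = "http://localhost:11434/v1/chat/completions"  # hypothetical local endpoint
MODEL = "LFM2.5-1.2B-Thinking"                            # placeholder model id


def measure_decode_tps(prompt: str, max_tokens: int = 256) -> float:
    """Return decoded tokens per second for one streamed completion."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": True,
    }
    start = None
    chunks = 0
    with requests.post(BASE_URL, json=payload, stream=True, timeout=600) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line.startswith(b"data:"):
                continue
            if line.strip() == b"data: [DONE]":
                break
            if start is None:
                start = time.time()  # start timing at the first decoded chunk (excludes prefill)
            chunks += 1              # one SSE chunk is roughly one decoded token
    elapsed = time.time() - start if start else float("inf")
    return chunks / elapsed


if __name__ == "__main__":
    # Crude ~16K-token filler prompt (assumes ~3 tokens per repeated phrase).
    long_prompt = "lorem ipsum " * 5500
    print(f"~{measure_decode_tps(long_prompt):.1f} tok/s decode")
```

Because timing starts at the first streamed chunk, the figure reflects decode throughput rather than prefill, which is the quantity quoted above.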

to light long context capability

minor correction on FLM perf. data

mlabonne changed pull request status to merged
