meituan-longcat/LongCat-Flash-Chat-FP8

@@ -1,6 +1,6 @@
 ---
+library_name: transformers
 license: mit
-library_name: LongCat-Flash-Chat
 pipeline_tag: text-generation
 tags:
 - transformers
@@ -8,6 +8,10 @@ tags:
 # LongCat-Flash-Chat
+Paper: [LongCat-Flash Technical Report](https://huggingface.co/papers/2509.01322)
+Project Page: https://longcat.ai
+GitHub Repository: https://github.com/meituan-longcat/LongCat-Flash-Chat
 <div align="center">
   <img src="https://raw.githubusercontent.com/meituan-longcat/LongCat-Flash-Chat/main/figures/longcat_logo.svg"
        width="300"
@@ -62,7 +66,7 @@ Effectively and efficiently scaling model size remains a key challenge in strate
 #### 🌟 Multi-Stage Training Pipeline for Agentic Capability
 Through a meticulously designed pipeline, LongCat-Flash is endowed with advanced agentic behaviors. Initial efforts focus on constructing a more suitable base model for agentic post-training, where we design a two-stage pretraining data fusion strategy to concentrate reasoning-intensive domain data. During mid-training, we enhance reasoning and coding capabilities while extending the context length to 128k to meet agentic post-training requirements. Building on this advanced base model, we proceed with a multi-stage post-training. Recognizing the scarcity of high-quality, high-difficulty training problems for agentic tasks, we design a multi-agent synthesis framework that defines task difficulty across three axes, i.e., information processing, tool-set complexity, and user interaction—using specialized controllers to generate complex tasks requiring iterative reasoning and environmental interaction.
-For more detail, please refer to the comprehensive [***LongCat-Flash Technical Report***](https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/tech_report.pdf).
+For more detail, please refer to the comprehensive [***LongCat-Flash Technical Report***](https://huggingface.co/papers/2509.01322).
 ## Evaluation Results
 | **Benchmark** | **DeepSeek V3.1** | **Qwen3 MoE-2507** | **Kimi-K2** | **GPT-4.1** | **Claude4 Sonnet** | **Gemini2.5 Flash** | **LongCat-Flash** |
@@ -220,5 +224,4 @@ We kindly encourage citation of our work if you find it useful.
 ## Contact
-Please contact us at <a href="mailto:longcat-team@meituan.com">longcat-team@meituan.com</a> or open an issue if you have any questions.
+Please contact us at <a href="mailto:longcat-team@meituan.com">longcat-team@meituan.com</a> or open an issue if you have any questions.