Improve model card: Update `library_name`, add relevant tags, and clarify links
This PR enhances the model card by:
- Setting the `library_name` to `transformers` in the metadata, which enables the automated "how to use" widget on the Hub. This is confirmed by the `auto_map` entries in `config.json`.
- Removing the redundant `transformers` tag and adding descriptive tags (`moe`, `mixture-of-experts`, `code-generation`) that reflect the model's architecture and capabilities, improving discoverability.
- Adding explicit links to the paper, GitHub repository, and project page at the top of the model card for easier access.
- Updating the internal paper link within the "Model Introduction" section to point to the official Hugging Face paper page (`https://huggingface.co/papers/2509.01322`).
- Adding a note in the "Quick Start" section informing users that `trust_remote_code=True` is required when loading the model with `transformers` (see the example sketch below this list).
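For reference, a minimal loading sketch in line with that note. This is an illustration rather than the model card's official example: the Hub repo id `meituan-longcat/LongCat-Flash-Chat`, the dtype/device settings, and the sample prompt are assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo id for this model card.
model_id = "meituan-longcat/LongCat-Flash-Chat"

# trust_remote_code=True is required because the repo ships custom modeling
# code, referenced via the auto_map entries in config.json.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",  # assumption: take the dtype from the checkpoint
    device_map="auto",   # assumption: shard across available devices (requires accelerate)
)

# Build the prompt with the chat template shipped in tokenizer_config.json.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The full diff of the model card (`README.md`) follows.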
```diff
--- a/README.md
+++ b/README.md
@@ -1,13 +1,17 @@
 ---
+library_name: transformers
 license: mit
-library_name: LongCat-Flash-Chat
 pipeline_tag: text-generation
 tags:
-- transformers
+- moe
+- mixture-of-experts
+- code-generation
 ---
 
 # LongCat-Flash-Chat
 
+[Paper](https://huggingface.co/papers/2509.01322) - [Code](https://github.com/meituan-longcat/LongCat-Flash-Chat) - [Project Page](https://longcat.ai)
+
 <div align="center">
 <img src="https://raw.githubusercontent.com/meituan-longcat/LongCat-Flash-Chat/main/figures/longcat_logo.svg"
 width="300"
@@ -62,7 +66,7 @@ Effectively and efficiently scaling model size remains a key challenge in strate
 #### 🌟 Multi-Stage Training Pipeline for Agentic Capability
 Through a meticulously designed pipeline, LongCat-Flash is endowed with advanced agentic behaviors. Initial efforts focus on constructing a more suitable base model for agentic post-training, where we design a two-stage pretraining data fusion strategy to concentrate reasoning-intensive domain data. During mid-training, we enhance reasoning and coding capabilities while extending the context length to 128k to meet agentic post-training requirements. Building on this advanced base model, we proceed with a multi-stage post-training. Recognizing the scarcity of high-quality, high-difficulty training problems for agentic tasks, we design a multi-agent synthesis framework that defines task difficulty across three axes, i.e., information processing, tool-set complexity, and user interaction—using specialized controllers to generate complex tasks requiring iterative reasoning and environmental interaction.
 
-For more detail, please refer to the comprehensive [***LongCat-Flash Technical Report***](https://
+For more detail, please refer to the comprehensive [***LongCat-Flash Technical Report***](https://huggingface.co/papers/2509.01322).
 
 ## Evaluation Results
 | **Benchmark** | **DeepSeek V3.1** | **Qwen3 MoE-2507** | **Kimi-K2** | **GPT-4.1** | **Claude4 Sonnet** | **Gemini2.5 Flash** | **LongCat-Flash** |
@@ -113,6 +117,7 @@ Note:
 * DeepSeek-V3.1, Qwen3-235B-A22B, Gemini2.5-Flash, and Claude4-Sonnet are evaluated under their non-thinking mode.
 
 ## Quick Start
+To use this model with the Hugging Face `transformers` library, you need to ensure `trust_remote_code=True` is set due to custom architectural components.
 
 ### Chat Template
 The details of our chat template are provided in the `tokenizer_config.json` file. Below are some examples.
@@ -220,5 +225,4 @@ We kindly encourage citation of our work if you find it useful.
 
 
 ## Contact
-Please contact us at <a href="mailto:longcat-team@meituan.com">longcat-team@meituan.com</a> or open an issue if you have any questions.
-
+Please contact us at <a href="mailto:longcat-team@meituan.com">longcat-team@meituan.com</a> or open an issue if you have any questions.
```