RangiLyu committed on
Commit
d503f48
·
verified ·
1 Parent(s): ff2baaa

update readme

.gitattributes CHANGED
@@ -34,3 +34,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
 
 
 
 
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ figs/efficiency.jpg filter=lfs diff=lfs merge=lfs -text
38
+ figs/performance.png filter=lfs diff=lfs merge=lfs -text
39
+ figs/title.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,9 +1,396 @@
1
  ---
2
  library_name: transformers
3
  license: apache-2.0
 
4
  pipeline_tag: image-text-to-text
5
  ---
6
 
7
- # InternS2Preview
8
 
9
- ![20260408-154223.jpg](https://picui.ogmua.cn/s1/2026/04/08/69d60695b0db7.webp)
 
1
  ---
2
  library_name: transformers
3
  license: apache-2.0
4
+ license_link: https://huggingface.co/internlm/Intern-S2-Preview-FP8/blob/main/LICENSE
5
  pipeline_tag: image-text-to-text
6
  ---
7
 
8
+ ## Intern-S2-Preview-FP8
9
+
10
+ <div align="center">
11
+ <img src="./figs/title.png" />
12
+
13
+ <div>&nbsp;</div>
14
+
15
+ [💻GitHub Repo](https://github.com/InternLM/Intern-S1) • [🤗Model Collections](https://huggingface.co/collections/internlm/intern-s2) • [💬Online Chat](https://chat.intern-ai.org.cn/)
16
+
17
+ </div>
18
+
19
+ <p align="center">
20
+ 👋 join us on <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://cdn.vansin.top/intern-s1.jpg" target="_blank">WeChat</a>
21
+ </p>
22
+
23
+
24
+
25
+ ## Introduction
26
+
27
+ We introduce **Intern-S2-Preview**, an efficient 35B scientific multimodal foundation model built by continued pre-training from Qwen3.5. Beyond conventional parameter and data scaling, Intern-S2-Preview explores **task scaling**: increasing the difficulty, diversity, and coverage of scientific tasks to further unlock model capabilities.
28
+
29
+ By extending professional scientific tasks into a full-chain training pipeline from pre-training to reinforcement learning, Intern-S2-Preview achieves performance comparable to the trillion-scale Intern-S1-Pro on multiple core professional scientific tasks, while using only 35B parameters. At the same time, it maintains strong general reasoning, multimodal understanding, coding, and agent capabilities.
30
+
31
+ ### Features
32
+
33
+ - **Scientific task scaling with full-chain training.** Intern-S2-Preview scales hundreds of professional scientific tasks from pre-training to RL, enabling strong performance across multiple specialized domains at only 35B parameters. It further strengthens spatial modeling for small-molecule structures and introduces real-valued prediction modules, making it the first open-source model with both material crystal structure generation capability and strong general capabilities.
34
+
35
+ - **Enhanced agent capabilities for scientific workflows.** Intern-S2-Preview significantly improves agentic abilities over the previous generation, achieving strong results on multiple scientific agent benchmarks.
36
+
37
+ - **Efficient RL reasoning with MTP and CoT compression.** During RL, Intern-S2-Preview adopts shared-weight MTP with KL loss to reduce the mismatch between training and inference behavior, substantially improving MTP accept rate and token generation speed. It also introduces CoT compression techniques to shorten responses while preserving strong reasoning capability, achieving improvements in both performance and efficiency.
38
+
39
+ <figure>
40
+ <img src="./figs/efficiency.jpg" alt="efficient RL reasoning with MTP and CoT compression">
41
+ <figcaption>Fig. 1: Reasoning efficiency on complex math benchmarks (accuracy vs. average response length). Intern-S2-Preview (red star) significantly outperforms the trillion-scale Intern-S1-Pro (red circle) and achieves higher accuracy with better token efficiency than other medium-size models.</figcaption>
42
+ </figure>
43
+
44
+ ### Performance
45
+
46
+ We evaluate Intern-S2-Preview on a range of general and scientific benchmarks. The comparison with recent VLMs and LLMs is reported below.
47
+
48
+ ![performance](./figs/performance.png)
49
+
50
+
51
+ > **Note**: <u>Underline</u> indicates the best performance among open-source models; **bold** indicates the best performance among all models.
52
+
53
+ We use [OpenCompass](https://github.com/open-compass/OpenCompass/) and [VLMEvalKit](https://github.com/open-compass/vlmevalkit) to evaluate all models. For text reasoning benchmarks, Intern-S2-Preview is evaluated with a maximum inference length of 128K tokens; for multimodal benchmarks, the limit is 64K tokens.
54
+
55
+
56
+ ## Quick Start
57
+
58
+ ### Sampling Parameters
59
+
60
+ We recommend the following sampling hyperparameters for better results:
61
+
62
+ ```python
63
+ top_p = 0.95
64
+ top_k = 50
65
+ min_p = 0.0
66
+ temperature = 0.8
67
+ ```
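To make the roles of these four parameters concrete, the following is a minimal pure-Python sketch of how temperature, top-k, top-p, and min-p filtering are commonly combined. The function name `filter_distribution` and the toy logits are illustrative assumptions; inference engines may apply the filters in a different order or with different tie-breaking.

```python
import math


def filter_distribution(logits, temperature=0.8, top_k=50, top_p=0.95, min_p=0.0):
    """Return the renormalized {token_id: prob} left after the sampling filters."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)

    # top_k: keep only the k most likely tokens.
    probs = probs[:top_k]

    # top_p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break

    # min_p: drop tokens below min_p times the highest probability.
    threshold = min_p * kept[0][0]
    kept = [(p, i) for p, i in kept if p >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}


toy_logits = [4.0, 3.0, 1.0, -2.0]  # made-up values for illustration
print(filter_distribution(toy_logits))
```

With `min_p = 0.0` the last filter is a no-op, so under the recommended settings only temperature, top-k, and top-p shape the candidate set.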
68
+
69
+ ### Serving
70
+
71
+ Intern-S2-Preview can be deployed using any of the following LLM inference frameworks:
72
+
73
+ - LMDeploy
74
+ - vLLM
75
+ - SGLang
76
+
77
+ Detailed deployment examples for these frameworks are available in the [Model Deployment Guide](./deployment_guide.md).
78
+
79
+
80
+ ## Advanced Usage
81
+
82
+ ### Tool Calling
83
+
84
+ Tool calling lets the model extend its capabilities by invoking external tools and APIs. The example below uses an OpenAI-compatible API (served by the LMDeploy API server) to answer a weather question with two stub temperature functions.
85
+
86
+ ```python
87
+
88
+
89
+ from openai import OpenAI
90
+ import json
91
+
92
+
93
+ def get_current_temperature(location: str, unit: str = "celsius"):
94
+     """Get current temperature at a location.
95
+
96
+     Args:
97
+         location: The location to get the temperature for, in the format "City, State, Country".
98
+         unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])
99
+
100
+     Returns:
101
+         the temperature, the location, and the unit in a dict
102
+     """
103
+     return {
104
+         "temperature": 26.1,
105
+         "location": location,
106
+         "unit": unit,
107
+     }
108
+
109
+
110
+ def get_temperature_date(location: str, date: str, unit: str = "celsius"):
111
+     """Get temperature at a location and date.
112
+
113
+     Args:
114
+         location: The location to get the temperature for, in the format "City, State, Country".
115
+         date: The date to get the temperature for, in the format "Year-Month-Day".
116
+         unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])
117
+
118
+     Returns:
119
+         the temperature, the location, the date and the unit in a dict
120
+     """
121
+     return {
122
+         "temperature": 25.9,
123
+         "location": location,
124
+         "date": date,
125
+         "unit": unit,
126
+     }
127
+
128
+ def get_function_by_name(name):
129
+     if name == "get_current_temperature":
130
+         return get_current_temperature
131
+     if name == "get_temperature_date":
132
+         return get_temperature_date
133
+
134
+ tools = [{
135
+     'type': 'function',
136
+     'function': {
137
+         'name': 'get_current_temperature',
138
+         'description': 'Get current temperature at a location.',
139
+         'parameters': {
140
+             'type': 'object',
141
+             'properties': {
142
+                 'location': {
143
+                     'type': 'string',
144
+                     'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
145
+                 },
146
+                 'unit': {
147
+                     'type': 'string',
148
+                     'enum': [
149
+                         'celsius',
150
+                         'fahrenheit'
151
+                     ],
152
+                     'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
153
+                 }
154
+             },
155
+             'required': [
156
+                 'location'
157
+             ]
158
+         }
159
+     }
160
+ }, {
161
+     'type': 'function',
162
+     'function': {
163
+         'name': 'get_temperature_date',
164
+         'description': 'Get temperature at a location and date.',
165
+         'parameters': {
166
+             'type': 'object',
167
+             'properties': {
168
+                 'location': {
169
+                     'type': 'string',
170
+                     'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
171
+                 },
172
+                 'date': {
173
+                     'type': 'string',
174
+                     'description': 'The date to get the temperature for, in the format \'Year-Month-Day\'.'
175
+                 },
176
+                 'unit': {
177
+                     'type': 'string',
178
+                     'enum': [
179
+                         'celsius',
180
+                         'fahrenheit'
181
+                     ],
182
+                     'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
183
+                 }
184
+             },
185
+             'required': [
186
+                 'location',
187
+                 'date'
188
+             ]
189
+         }
190
+     }
191
+ }]
192
+
193
+
194
+
195
+ messages = [
196
+     {'role': 'user', 'content': 'Today is 2024-11-14, What\'s the temperature in San Francisco now? How about tomorrow?'}
197
+ ]
198
+
199
+ openai_api_key = "EMPTY"
200
+ openai_api_base = "http://0.0.0.0:23333/v1"
201
+ client = OpenAI(
202
+     api_key=openai_api_key,
203
+     base_url=openai_api_base,
204
+ )
205
+ model_name = client.models.list().data[0].id
206
+ response = client.chat.completions.create(
207
+     model=model_name,
208
+     messages=messages,
209
+     max_tokens=32768,
210
+     temperature=0.8,
211
+     top_p=0.95,
212
+     extra_body=dict(spaces_between_special_tokens=False),
213
+     tools=tools)
214
+ print(response.choices[0].message)
215
+ messages.append(response.choices[0].message)
216
+
217
+ for tool_call in response.choices[0].message.tool_calls:
218
+     tool_call_args = json.loads(tool_call.function.arguments)
219
+     tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
220
+     tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
221
+     messages.append({
222
+         'role': 'tool',
223
+         'name': tool_call.function.name,
224
+         'content': tool_call_result,
225
+         'tool_call_id': tool_call.id
226
+     })
227
+
228
+ response = client.chat.completions.create(
229
+     model=model_name,
230
+     messages=messages,
231
+     temperature=0.8,
232
+     top_p=0.95,
233
+     extra_body=dict(spaces_between_special_tokens=False),
234
+     tools=tools)
235
+ print(response.choices[0].message)
236
+ ```
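The loop in the example above assumes that the model always returns tool calls and that every requested tool name is known. The following is a minimal defensive sketch of the same dispatch step. `TOOL_REGISTRY` and `run_tool_calls` are hypothetical names introduced here, and the `SimpleNamespace` objects are stand-ins that mimic only the `id`, `function.name`, and `function.arguments` fields used above, so the sketch runs without a server.

```python
import json
from types import SimpleNamespace


def get_current_temperature(location, unit="celsius"):
    # Stub with a fixed value, mirroring the example above.
    return {"temperature": 26.1, "location": location, "unit": unit}


# Hypothetical registry: maps tool names to Python callables.
TOOL_REGISTRY = {"get_current_temperature": get_current_temperature}


def run_tool_calls(tool_calls, registry=TOOL_REGISTRY):
    """Turn tool calls into 'tool' role messages, reporting errors as content."""
    messages = []
    for call in tool_calls or []:  # tool_calls may be None for a plain answer
        name = call.function.name
        func = registry.get(name)
        if func is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            try:
                result = func(**json.loads(call.function.arguments))
            except (TypeError, json.JSONDecodeError) as exc:
                result = {"error": f"bad arguments for {name}: {exc}"}
        messages.append({
            "role": "tool",
            "name": name,
            "content": json.dumps(result, ensure_ascii=False),
            "tool_call_id": call.id,
        })
    return messages


# Stand-in for one entry of response.choices[0].message.tool_calls.
fake_call = SimpleNamespace(
    id="call_0",
    function=SimpleNamespace(
        name="get_current_temperature",
        arguments='{"location": "San Francisco, CA, USA"}',
    ),
)
print(run_tool_calls([fake_call]))
print(run_tool_calls(None))
```

Returning the error as tool content, rather than raising, is one possible design choice: it gives the model a chance to correct the call in the next turn.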
237
+
238
+ ### Switching Between Thinking and Non-Thinking Modes
239
+
240
+ Intern-S2-Preview enables thinking mode by default, which strengthens the model's reasoning and tends to produce higher-quality responses. To disable it, set `enable_thinking=False` in `tokenizer.apply_chat_template`:
241
+
242
+ ```python
243
+ text = tokenizer.apply_chat_template(
244
+     messages,
245
+     tokenize=False,
246
+     add_generation_prompt=True,
247
+     enable_thinking=False  # disable thinking mode (enabled by default)
248
+ )
249
+ ```
250
+
251
+ When serving Intern-S2-Preview models, you can dynamically control the thinking mode by adjusting the `enable_thinking` parameter in your requests.
252
+
253
+ ```python
254
+ from openai import OpenAI
255
+ import json
256
+
257
+ messages = [
258
+     {
259
+         'role': 'user',
260
+         'content': 'who are you'
261
+     }, {
262
+         'role': 'assistant',
263
+         'content': 'I am an AI'
264
+     }, {
265
+         'role': 'user',
266
+         'content': 'AGI is?'
267
+     }]
268
+
269
+ openai_api_key = "EMPTY"
270
+ openai_api_base = "http://0.0.0.0:23333/v1"
271
+ client = OpenAI(
272
+     api_key=openai_api_key,
273
+     base_url=openai_api_base,
274
+ )
275
+ model_name = client.models.list().data[0].id
276
+
277
+ response = client.chat.completions.create(
278
+     model=model_name,
279
+     messages=messages,
280
+     temperature=0.8,
281
+     top_p=0.95,
282
+     max_tokens=2048,
283
+     extra_body={
284
+         "chat_template_kwargs": {"enable_thinking": False}
285
+     }
286
+ )
287
+ print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
288
+ ```
289
+
290
+ > Note: We do not recommend disabling thinking mode for agentic tasks.
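One way to keep the thinking switch and the note above in a single place is a small request builder. This is a minimal sketch: `build_chat_kwargs` and its `agentic` flag are hypothetical names introduced here, and the returned dictionary is meant to be passed as `client.chat.completions.create(**kwargs)`. It only adds `chat_template_kwargs` when thinking is disabled, relying on the default otherwise.

```python
import warnings


def build_chat_kwargs(messages, model_name, enable_thinking=True, agentic=False):
    """Build keyword arguments for client.chat.completions.create (hypothetical helper)."""
    kwargs = {
        "model": model_name,
        "messages": messages,
        "temperature": 0.8,
        "top_p": 0.95,
    }
    if not enable_thinking:
        if agentic:
            # Mirrors the note above: thinking should stay on for agentic tasks.
            warnings.warn("Disabling thinking mode is not recommended for agentic tasks.")
        # Same request field as in the serving example above.
        kwargs["extra_body"] = {"chat_template_kwargs": {"enable_thinking": False}}
    return kwargs


fast = build_chat_kwargs(
    [{"role": "user", "content": "who are you"}],
    "internlm/Intern-S2-Preview",
    enable_thinking=False,
)
print(fast["extra_body"])
```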
291
+
292
+
293
+ ## Agent Integration
294
+
295
+ Intern-S2-Preview can be plugged into agent frameworks in two ways: connecting to a **self-hosted deployment**, or calling the **official InternLM API**. Below we cover both, with examples for agent frameworks (OpenClaw, Hermes, etc.) and for Claude Code.
296
+
297
+ ### 1. Self-hosted Deployment (LMDeploy as an example)
298
+
299
+ First, serve the model with LMDeploy following the [Model Deployment Guide](./deployment_guide.md). The example below assumes the server is running at `http://0.0.0.0:23333`.
300
+
301
+ #### Connecting Agent Frameworks
302
+
303
+ Most agent frameworks (OpenClaw, Hermes, etc.) accept an OpenAI-compatible endpoint. Point them at the LMDeploy server base URL `http://0.0.0.0:23333/v1`.
304
+
305
+ You can check the connection with the following command:
306
+
307
+ ```bash
308
+ curl http://0.0.0.0:23333/v1/chat/completions \
309
+ -H "Content-Type: application/json" \
310
+ -H "Authorization: Bearer EMPTY" \
311
+ -d '{
312
+ "model": "internlm/Intern-S2-Preview",
313
+ "messages": [
314
+ {"role": "user", "content": "Hello"}
315
+ ],
316
+ "temperature": 0.8,
317
+ "top_p": 0.95
318
+ }'
319
+ ```
320
+
321
+ Alternatively, configure your agent framework with the following environment variables:
322
+
323
+ ```bash
324
+ export OPENAI_API_KEY=EMPTY
325
+ export OPENAI_BASE_URL=http://0.0.0.0:23333/v1
326
+ export OPENAI_MODEL=internlm/Intern-S2-Preview
327
+ ```
328
+
329
+ Remember to launch LMDeploy with `--tool-call-parser interns2-preview` so tool calls are parsed correctly.
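For scripts that do not go through an agent framework, the same three environment variables can be read directly. The sketch below uses only the standard library and builds the request without sending it; `build_chat_request` is a hypothetical helper name, and the fallback defaults simply repeat the values shown above.

```python
import json
import os
import urllib.request


def build_chat_request(prompt, env=os.environ):
    """Build (but do not send) a chat completion request from the environment."""
    base_url = env.get("OPENAI_BASE_URL", "http://0.0.0.0:23333/v1").rstrip("/")
    api_key = env.get("OPENAI_API_KEY", "EMPTY")
    model = env.get("OPENAI_MODEL", "internlm/Intern-S2-Preview")
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.8,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# An empty mapping forces the fallback defaults, independent of the real environment.
request = build_chat_request("Hello", env={})
print(request.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```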
330
+
331
+ #### Connecting Claude Code
332
+
333
+ LMDeploy exposes an Anthropic-compatible `/v1/messages` endpoint that Claude Code can talk to directly. Add the following to `~/.claude/settings.json`:
334
+
335
+ ```json
336
+ {
337
+   "env": {
338
+     "ANTHROPIC_BASE_URL": "http://127.0.0.1:23333",
339
+     "ANTHROPIC_AUTH_TOKEN": "dummy",
340
+     "ANTHROPIC_MODEL": "internlm/Intern-S2-Preview",
341
+     "ANTHROPIC_CUSTOM_MODEL_OPTION": "internlm/Intern-S2-Preview"
342
+   }
343
+ }
344
+ ```
345
+
346
+ For a full walkthrough (curl verification, model routing, troubleshooting), see [LMDeploy × Claude Code](https://lmdeploy.readthedocs.io/en/latest/intergration/claude_code.html).
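If `~/.claude/settings.json` already exists, pasting the block over it would discard other settings. The following is a minimal sketch of merging the `env` entries instead. `merge_claude_env` is a hypothetical helper, it assumes the file (when present) holds a JSON object, and the demo writes to a temporary directory rather than the real settings file.

```python
import json
import tempfile
from pathlib import Path

# Values copied from the settings block above.
INTERN_ENV = {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:23333",
    "ANTHROPIC_AUTH_TOKEN": "dummy",
    "ANTHROPIC_MODEL": "internlm/Intern-S2-Preview",
    "ANTHROPIC_CUSTOM_MODEL_OPTION": "internlm/Intern-S2-Preview",
}


def merge_claude_env(settings_path, env_updates=INTERN_ENV):
    """Merge env_updates into the "env" object of a settings file, keeping other keys."""
    path = Path(settings_path)
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    settings.setdefault("env", {}).update(env_updates)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Demo against a scratch file; pass Path.home() / ".claude" / "settings.json" for real use.
with tempfile.TemporaryDirectory() as tmp:
    print(merge_claude_env(Path(tmp) / "settings.json"))
```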
347
+
348
+ ### 2. Official Intern API
349
+
350
+ If you do not want to self-host, you can use the official Intern API. Register at [internlm.intern-ai.org.cn](https://internlm.intern-ai.org.cn/) and create an API token (`sk-xxxxxxxx`).
351
+
352
+ #### Connecting Agent Frameworks
353
+
354
+ The service is OpenAI-compatible, so any agent framework that accepts an OpenAI-compatible endpoint should work. Set the base URL to `https://chat.intern-ai.org.cn/api/v1` and the model name to `intern-s2-preview` in the CLI or config file.
355
+
356
+ You can check the connection with the following command:
357
+
358
+ ```bash
359
+ curl https://chat.intern-ai.org.cn/api/v1/chat/completions \
360
+ -H "Content-Type: application/json" \
361
+ -H "Authorization: Bearer sk-xxxxxxxx" \
362
+ -d '{
363
+ "model": "intern-s2-preview",
364
+ "messages": [
365
+ {"role": "user", "content": "Hello"}
366
+ ],
367
+ "temperature": 0.8,
368
+ "top_p": 0.95
369
+ }'
370
+ ```
371
+
372
+ Refer to the [Intern API documentation](https://internlm.intern-ai.org.cn/api/document?lang=en) for the current endpoint, available model names, rate limits, and advanced parameters.
373
+
374
+ #### Connecting Claude Code
375
+
376
+ Claude Code can route to the official Intern API by pointing `ANTHROPIC_BASE_URL` at the Intern Anthropic-compatible gateway:
377
+
378
+ ```json
379
+ {
380
+   "env": {
381
+     "ANTHROPIC_BASE_URL": "http://chat.staging.intern-ai.org.cn",
382
+     "ANTHROPIC_AUTH_TOKEN": "your-api-token",
383
+     "ANTHROPIC_MODEL": "intern-s2-preview",
384
+     "ANTHROPIC_SMALL_FAST_MODEL": "intern-s2-preview"
385
+   }
386
+ }
387
+ ```
388
+
389
+ Then start Claude Code with the following command:
390
+
391
+ ```bash
392
+ claude --model intern-s2-preview
393
+ ```
394
+
395
+ For step-by-step setup, see [Intern API × Claude Code Integration](https://internlm.intern-ai.org.cn/api/document?lang=en).
396
 
 
deployment_guide.md ADDED
@@ -0,0 +1,116 @@
1
+ # Intern-S2-Preview Deployment Guide
2
+
3
+ The Intern-S2-Preview release is a 35B-A3B model stored in bfloat16 weight format. This guide provides deployment examples for the following configurations:
4
+
5
+ - MTP speculative decoding (Recommended)
6
+ - Basic serving without MTP
7
+ - Long-context inference with YaRN RoPE configuration
8
+
9
+ > NOTE: The commands below are reference configurations. Inference frameworks are under active development, so use the latest framework documentation and your local validation results when tuning production deployments.
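As rough intuition for why MTP speculative decoding is the recommended configuration, the sketch below estimates how many tokens one verification step emits as a function of the draft accept rate. It uses the standard simplified model in which each drafted token is accepted independently with the same probability, which is an assumption and not a measured property of this model. `expected_tokens_per_step` is a hypothetical helper, and treating the draft-token flags in the commands below as `num_draft_tokens = 4` is likewise an assumption, since frameworks count draft tokens differently.

```python
def expected_tokens_per_step(accept_rate, num_draft_tokens):
    """Expected tokens emitted per verification step under an i.i.d. acceptance model.

    Drafted tokens are accepted left to right until the first rejection; the
    target model then contributes one extra token, so the expectation is
    1 + a + a^2 + ... + a^k = (1 - a^(k+1)) / (1 - a).
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    if accept_rate == 1.0:
        return float(num_draft_tokens + 1)
    return (1.0 - accept_rate ** (num_draft_tokens + 1)) / (1.0 - accept_rate)


# Illustrative accept rates only; these are not reported measurements.
for rate in (0.5, 0.7, 0.9):
    print(rate, round(expected_tokens_per_step(rate, 4), 3))
```

Under this model, a higher accept rate translates directly into more tokens per step, which is consistent with the README's emphasis on improving the MTP accept rate during RL.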
10
+
11
+ ## LMDeploy
12
+
13
+ Use the latest LMDeploy (>=0.13.0) with Intern-S2-Preview support.
14
+
15
+ - Serving With MTP (Recommended)
16
+
17
+ ```bash
18
+ lmdeploy serve api_server \
19
+ internlm/Intern-S2-Preview \
20
+ --trust-remote-code \
21
+ --backend pytorch \
22
+ --tp 2 \
23
+ --reasoning-parser default \
24
+ --tool-call-parser interns2-preview \
25
+ --speculative-algorithm qwen3_5_mtp \
26
+ --speculative-num-draft-tokens 4 \
27
+ --max-batch-size 256
28
+ ```
29
+
30
+ - Basic Serving Without MTP
31
+
32
+ ```bash
33
+ lmdeploy serve api_server \
34
+ internlm/Intern-S2-Preview \
35
+ --trust-remote-code \
36
+ --backend pytorch \
37
+ --tp 2 \
38
+ --reasoning-parser default \
39
+ --tool-call-parser interns2-preview
40
+ ```
41
+
42
+ - Long-Context Serving
43
+
44
+ For long-context inference, configure both `--session-len` and YaRN RoPE parameters. The following example uses a 512k context length:
45
+
46
+ ```bash
47
+ lmdeploy serve api_server \
48
+ internlm/Intern-S2-Preview \
49
+ --trust-remote-code \
50
+ --tp 2 \
51
+ --backend pytorch \
52
+ --reasoning-parser default \
53
+ --tool-call-parser interns2-preview \
54
+ --session-len 512000 \
55
+ --max-batch-size 64 \
56
+ --hf-overrides '{"text_config": {"rope_parameters": {"mrope_interleaved": true, "mrope_section": [11, 11, 10], "rope_type": "yarn", "rope_theta": 10000000, "partial_rotary_factor": 0.25, "factor": 4.0, "original_max_position_embeddings": 262144}}}'
57
+ ```
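The YaRN numbers in the long-context command can be sanity-checked with a little arithmetic. The sketch below rebuilds the `--hf-overrides` payload and checks that `factor` times `original_max_position_embeddings` covers the requested `--session-len`. It assumes the usual reading of YaRN in which the extended window is roughly the product of these two values; `yarn_overrides` and `min_yarn_factor` are hypothetical helpers.

```python
import json

ORIGINAL_MAX_POSITIONS = 262144  # original_max_position_embeddings in the command above
YARN_FACTOR = 4.0                # "factor" in the command above
SESSION_LEN = 512000             # --session-len in the command above


def yarn_overrides(factor, original_max_positions):
    """Build the --hf-overrides payload used for YaRN RoPE scaling."""
    return {
        "text_config": {
            "rope_parameters": {
                "mrope_interleaved": True,
                "mrope_section": [11, 11, 10],
                "rope_type": "yarn",
                "rope_theta": 10000000,
                "partial_rotary_factor": 0.25,
                "factor": factor,
                "original_max_position_embeddings": original_max_positions,
            }
        }
    }


def min_yarn_factor(session_len, original_max_positions):
    """Smallest scaling factor whose extended window covers session_len."""
    return max(1.0, session_len / original_max_positions)


extended = int(YARN_FACTOR * ORIGINAL_MAX_POSITIONS)
print(extended)                                              # extended window in tokens
print(min_yarn_factor(SESSION_LEN, ORIGINAL_MAX_POSITIONS))  # lower bound on factor
print(json.dumps(yarn_overrides(YARN_FACTOR, ORIGINAL_MAX_POSITIONS)))
```

The configured factor of 4.0 therefore leaves headroom above the 512k session length. Whether a smaller factor would serve that session length equally well is not stated in this guide.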
58
+
59
+ ## vLLM
60
+
61
+ Use the latest vLLM Docker image or source build with Intern-S2-Preview support.
62
+
63
+ - Serving With MTP (Recommended)
64
+
65
+ ```bash
66
+ vllm serve internlm/Intern-S2-Preview \
67
+ --trust-remote-code \
68
+ --tensor-parallel-size 2 \
69
+ --reasoning-parser qwen3 \
70
+ --enable-auto-tool-choice \
71
+ --tool-call-parser qwen3_coder \
72
+ --speculative-config '{"method":"mtp","num_speculative_tokens":4}'
73
+ ```
74
+
75
+ - Basic Serving Without MTP
76
+
77
+ ```bash
78
+ vllm serve internlm/Intern-S2-Preview \
79
+ --trust-remote-code \
80
+ --tensor-parallel-size 2 \
81
+ --reasoning-parser qwen3 \
82
+ --enable-auto-tool-choice \
83
+ --tool-call-parser qwen3_coder
84
+ ```
85
+
86
+ ## SGLang
87
+
88
+ Use the latest SGLang Docker image or source build with Intern-S2-Preview support.
89
+
90
+ - Serving With MTP (Recommended)
91
+
92
+ ```bash
93
+ SGLANG_ENABLE_SPEC_V2=1 \
94
+ python3 -m sglang.launch_server \
95
+ --model-path internlm/Intern-S2-Preview \
96
+ --trust-remote-code \
97
+ --tp-size 2 \
98
+ --reasoning-parser qwen3 \
99
+ --tool-call-parser qwen3_coder \
100
+ --mamba-scheduler-strategy extra_buffer \
101
+ --speculative-algo 'NEXTN' \
102
+ --speculative-eagle-topk 1 \
103
+ --speculative-num-steps 3 \
104
+ --speculative-num-draft-tokens 4
105
+ ```
106
+
107
+ - Basic Serving Without MTP
108
+
109
+ ```bash
110
+ python3 -m sglang.launch_server \
111
+ --model-path internlm/Intern-S2-Preview \
112
+ --trust-remote-code \
113
+ --tp-size 2 \
114
+ --reasoning-parser qwen3 \
115
+ --tool-call-parser qwen3_coder
116
+ ```
figs/efficiency.jpg ADDED

Git LFS Details

  • SHA256: 39b53166ece4ceda370e99c9d864f8150b98159747cd84c3d538588e3934c859
  • Pointer size: 131 Bytes
  • Size of remote file: 346 kB
figs/performance.png ADDED

Git LFS Details

  • SHA256: 85ec61e9af588fb1f03774c79517b6052e93e63727e92b24bb5d868d8e420d03
  • Pointer size: 132 Bytes
  • Size of remote file: 1.1 MB
figs/title.png ADDED

Git LFS Details

  • SHA256: 1e0080637b1009715c78ad8fb9b00f2355282b79e9e332100b8943f1a17eb33c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.3 MB