---
language:
- zh
- en
pipeline_tag: text-generation
library_name: transformers
---
<div align="center">
<picture>
<img src="figures/joyai-logo.png" width="30%" alt="JoyAI-LLM Flash">
</picture>
</div>
<hr>

<div align="center" style="line-height: 1;">
<a href="https://huggingface.co/jdopensource" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-JD-ffc107?color=ffc107&logoColor=white"/></a>
<a href="https://huggingface.co/jdopensource/JoyAI-LLM-Flash/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a>
</div>

## 1. Model Introduction

JoyAI-LLM-Flash is a state-of-the-art medium-sized instruct language model with 3 billion activated parameters and 48 billion total parameters. It was pretrained on 20 trillion text tokens with the Muon optimizer, then refined through large-scale supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) across diverse environments. JoyAI-LLM-Flash achieves strong performance on frontier knowledge, reasoning, coding, and agentic benchmarks.

### Key Features

- **Fiber Bundle RL**: introduces fiber bundle theory into reinforcement learning via FiberPO, a novel optimization framework designed for large-scale, heterogeneous agent training that improves stability and robustness under complex data distributions.
- **Training-Inference Collaboration**: applies the Muon optimizer with dense multi-token prediction (MTP) and introduces novel optimization techniques that resolve instabilities at scale, delivering 1.3x to 1.7x the throughput of the non-MTP version.
- **Agentic Intelligence**: designed for tool use, reasoning, and autonomous problem-solving.

## 2. Model Summary

|                                             |                          |
| :-----------------------------------------: | :----------------------: |
| **Architecture**                            | Mixture-of-Experts (MoE) |
| **Total Parameters**                        | 48B                      |
| **Activated Parameters**                    | 3B                       |
| **Number of Layers** (Dense layer included) | 40                       |
| **Number of Dense Layers**                  | 1                        |
| **Attention Hidden Dimension**              | 2048                     |
| **MoE Hidden Dimension** (per Expert)       | 768                      |
| **Number of Attention Heads**               | 32                       |
| **Number of Experts**                       | 256                      |
| **Selected Experts per Token**              | 8                        |
| **Number of Shared Experts**                | 1                        |
| **Vocabulary Size**                         | 129K                     |
| **Context Length**                          | 128K                     |
| **Attention Mechanism**                     | MLA                      |
| **Activation Function**                     | SwiGLU                   |

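The expert-routing figures above (256 routed experts, 8 selected per token, plus one always-on shared expert) correspond to standard top-k MoE gating. The toy, framework-free sketch below illustrates only that routing step; all names and the random logits are illustrative, not the model's actual implementation:

```python
import math
import random

NUM_EXPERTS = 256  # routed experts (the shared expert is always active and bypasses routing)
TOP_K = 8          # experts selected per token


def top_k_routing(logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their gate weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)                 # subtract max for numerical stability
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return list(zip(top, (e / total for e in exps)))


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routes = top_k_routing(logits)
print(len(routes))                                   # 8
print(abs(sum(w for _, w in routes) - 1.0) < 1e-9)   # True
```

In a real MoE layer the token's output would be the shared expert's output plus the gate-weighted sum of the 8 selected experts' outputs.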
## 3. Evaluation Results

<table>
<thead>
<tr>
<th align="center">Benchmark</th>
<th align="center"><sup>JoyAI-LLM Flash</sup></th>
<th align="center"><sup>Qwen3-30B-A3B-Instruct-2507</sup></th>
<th align="center"><sup>GLM-4.7-Flash<br>(Non-thinking)</sup></th>
</tr>
</thead>
<tbody>
<tr>
<td align="center" colspan=3><strong>Knowledge &amp; Alignment</strong></td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">MMLU</td>
<td align="center" style="vertical-align: middle"><strong>89.50</strong></td>
<td align="center" style="vertical-align: middle">86.87</td>
<td align="center" style="vertical-align: middle">80.53</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">MMLU-Pro</td>
<td align="center" style="vertical-align: middle"><strong>81.02</strong></td>
<td align="center" style="vertical-align: middle">73.88</td>
<td align="center" style="vertical-align: middle">63.62</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">CMMLU</td>
<td align="center" style="vertical-align: middle"><strong>87.03</strong></td>
<td align="center" style="vertical-align: middle">85.88</td>
<td align="center" style="vertical-align: middle">75.85</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">GPQA-Diamond</td>
<td align="center" style="vertical-align: middle"><strong>74.43</strong></td>
<td align="center" style="vertical-align: middle">68.69</td>
<td align="center" style="vertical-align: middle">39.90</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">SuperGPQA</td>
<td align="center" style="vertical-align: middle"><strong>55.00</strong></td>
<td align="center" style="vertical-align: middle">52.00</td>
<td align="center" style="vertical-align: middle">32.00</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">LiveBench</td>
<td align="center" style="vertical-align: middle"><strong>72.90</strong></td>
<td align="center" style="vertical-align: middle">59.70</td>
<td align="center" style="vertical-align: middle">43.10</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">IFEval</td>
<td align="center" style="vertical-align: middle"><strong>86.69</strong></td>
<td align="center" style="vertical-align: middle">83.18</td>
<td align="center" style="vertical-align: middle">82.44</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">AlignBench</td>
<td align="center" style="vertical-align: middle"><strong>8.24</strong></td>
<td align="center" style="vertical-align: middle">8.07</td>
<td align="center" style="vertical-align: middle">6.85</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">HellaSwag</td>
<td align="center" style="vertical-align: middle"><strong>91.79</strong></td>
<td align="center" style="vertical-align: middle">89.90</td>
<td align="center" style="vertical-align: middle">60.84</td>
</tr>
<tr>
<td align="center" colspan=3><strong>Coding</strong></td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">HumanEval</td>
<td align="center" style="vertical-align: middle"><strong>96.34</strong></td>
<td align="center" style="vertical-align: middle">95.12</td>
<td align="center" style="vertical-align: middle">74.39</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">LiveCodeBench</td>
<td align="center" style="vertical-align: middle"><strong>65.60</strong></td>
<td align="center" style="vertical-align: middle">39.71</td>
<td align="center" style="vertical-align: middle">27.43</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">SciCode</td>
<td align="center" style="vertical-align: middle"><strong>3.08/22.92</strong></td>
<td align="center" style="vertical-align: middle"><strong>3.08/22.92</strong></td>
<td align="center" style="vertical-align: middle">3.08/15.11</td>
</tr>
<tr>
<td align="center" colspan=3><strong>Mathematics</strong></td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">GSM8K</td>
<td align="center" style="vertical-align: middle"><strong>95.83</strong></td>
<td align="center" style="vertical-align: middle">79.83</td>
<td align="center" style="vertical-align: middle">81.88</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">AIME2025</td>
<td align="center" style="vertical-align: middle"><strong>65.83</strong></td>
<td align="center" style="vertical-align: middle">62.08</td>
<td align="center" style="vertical-align: middle">24.17</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">MATH 500</td>
<td align="center" style="vertical-align: middle"><strong>97.10</strong></td>
<td align="center" style="vertical-align: middle">89.80</td>
<td align="center" style="vertical-align: middle">90.90</td>
</tr>
<tr>
<td align="center" colspan=3><strong>Agentic</strong></td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">SWE-bench Verified</td>
<td align="center" style="vertical-align: middle"><strong>60.60</strong></td>
<td align="center" style="vertical-align: middle">24.44</td>
<td align="center" style="vertical-align: middle">51.60</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">Tau2-Retail</td>
<td align="center" style="vertical-align: middle"><strong>67.55</strong></td>
<td align="center" style="vertical-align: middle">53.51</td>
<td align="center" style="vertical-align: middle">62.28</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">Tau2-Airline</td>
<td align="center" style="vertical-align: middle"><strong>54.00</strong></td>
<td align="center" style="vertical-align: middle">32.00</td>
<td align="center" style="vertical-align: middle">52.00</td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">Tau2-Telecom</td>
<td align="center" style="vertical-align: middle">79.83</td>
<td align="center" style="vertical-align: middle">4.39</td>
<td align="center" style="vertical-align: middle"><strong>88.60</strong></td>
</tr>
<tr>
<td align="center" colspan=3><strong>Long Context</strong></td>
</tr>
<tr>
<td align="center" style="vertical-align: middle">RULER</td>
<td align="center" style="vertical-align: middle"><strong>95.60</strong></td>
<td align="center" style="vertical-align: middle">89.66</td>
<td align="center" style="vertical-align: middle">56.12</td>
</tr>
</tbody>
</table>

## 4. Deployment

> [!NOTE]
> The JoyAI-LLM Flash API is available at https://docs.jdcloud.com/cn/jdaip/chat, with OpenAI- and Anthropic-compatible endpoints.
> Currently, JoyAI-LLM-Flash-Block-INT8 is recommended to run on the following inference engines:

* SGLang

Deployment examples can be found in the [Model Deployment Guide](docs/deploy_guidance.md).

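For reference, a typical single-node SGLang launch looks like the following. The flags shown are illustrative; the exact options (tensor parallelism, quantization, port) are covered in the deployment guide:

```shell
# Launch an OpenAI-compatible SGLang server (flags illustrative; see the deployment guide)
python -m sglang.launch_server \
  --model-path jdopensource/JoyAI-LLM-Flash \
  --port 30000 \
  --trust-remote-code
```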
## 5. Model Usage

The demos below show how to call the official JoyAI-LLM Flash API.

For third-party APIs deployed with vLLM or SGLang, please note:

> [!NOTE]
> Recommended sampling parameters: `temperature=0.6`, `top_p=1.0`

### Chat Completion

This simple chat completion script shows how to call the JoyAI-LLM Flash API.

```python
from openai import OpenAI

# Point the client at your deployment; a local server needs no real API key.
client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def simple_chat(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "which one is bigger, 9.11 or 9.9? think carefully.",
                }
            ],
        },
    ]
    # Use the first model served by the endpoint.
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    print(f"response: {response.choices[0].message.content}")


if __name__ == "__main__":
    simple_chat(client)
```
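If you pass `stream=True` instead, the response arrives as chunks whose incremental deltas must be concatenated. A minimal, self-contained accumulator sketch; the mock chunks below stand in for the SDK's streaming objects, and with a live server you would iterate over the streaming response directly:

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Concatenate the incremental delta.content pieces of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if getattr(delta, "content", None):  # some chunks carry no content (e.g. role-only deltas)
            parts.append(delta.content)
    return "".join(parts)


# Mock chunks shaped like the OpenAI SDK's streaming objects, for illustration:
mock = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["9.9 is ", "bigger ", "than 9.11."]
]
print(collect_stream(mock))  # 9.9 is bigger than 9.11.
```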

### Tool Call Completion

This simple tool call completion script shows how to call the JoyAI-LLM Flash API with function calling.

```python
import json

from openai import OpenAI

client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def my_calculator(expression: str) -> str:
    # NOTE: eval() is for demo purposes only; never evaluate untrusted input.
    return str(eval(expression))


def rewrite(text: str) -> str:
    return str(text)


def simple_tool_call(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "use my functions to compute the results for the equations: 6+1",
                },
            ],
        },
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "my_calculator",
                "description": "A calculator that can evaluate a mathematical equation and compute its results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "The mathematical expression to evaluate.",
                        },
                    },
                    "required": ["expression"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "rewrite",
                "description": "Rewrite a given text for improved clarity",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The input text to rewrite",
                        }
                    },
                    "required": ["text"],
                },
            },
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
        tools=tools,
        tool_choice="auto",
    )
    tool_calls = response.choices[0].message.tool_calls

    # Execute every tool call so results stay aligned with tool_calls below.
    results = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        if function_name == "my_calculator":
            results.append(my_calculator(**function_args))
        elif function_name == "rewrite":
            results.append(rewrite(**function_args))
    messages.append({"role": "assistant", "tool_calls": tool_calls})
    for tool_call, result in zip(tool_calls, results):
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call.function.name,
                "content": result,
            }
        )
    # Send the tool results back so the model can produce a final answer.
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    simple_tool_call(client)
```
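The dispatch loop above relies on `tool_call.function.arguments` arriving as a JSON string. As you add tools, a name-to-function registry keeps dispatch from growing into an if/elif chain; a minimal standalone sketch, where the fake tool call object is illustrative and merely shaped like the SDK's:

```python
import json
from types import SimpleNamespace


def my_calculator(expression: str) -> str:
    # Demo only: eval() must never see untrusted input.
    return str(eval(expression))


def rewrite(text: str) -> str:
    return str(text)


# Registry mapping tool names (as declared in the `tools` schema) to callables.
TOOL_REGISTRY = {"my_calculator": my_calculator, "rewrite": rewrite}


def dispatch(tool_call) -> str:
    fn = TOOL_REGISTRY[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    return fn(**args)


# A fake tool call shaped like the OpenAI SDK object, for illustration:
fake = SimpleNamespace(
    function=SimpleNamespace(name="my_calculator", arguments='{"expression": "6+1"}')
)
print(dispatch(fake))  # 7
```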

---

## 6. License

Both the code repository and the model weights are released under the [Modified MIT License](LICENSE).