Prince-1 committed on
Commit 7934f72 · verified · 1 Parent(s): 3213273

Build the rkllm format of model Osmosis-Mcp

Files changed (3)
  1. .gitattributes +1 -0
  2. Osmosis-Mcp.rkllm +3 -0
  3. README.md +55 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Osmosis-Mcp.rkllm filter=lfs diff=lfs merge=lfs -text
Osmosis-Mcp.rkllm ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9031c0a378f84cef3365efcb83371999e4159ef5c78e36153ab1ba106d21ff20
+ size 8855189526
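
The three added lines are a Git LFS pointer: the repository tracks this small text stub, while the model weights themselves are stored in LFS. As a minimal sketch, the pointer can be parsed and its size converted to readable units as follows. The helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling; only the pointer text is taken from the diff above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-method>:<hex digest>".
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the Osmosis-Mcp.rkllm hunk above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9031c0a378f84cef3365efcb83371999e4159ef5c78e36153ab1ba106d21ff20
size 8855189526
"""

info = parse_lfs_pointer(POINTER)
size_gib = info["size_bytes"] / 2**30
print(f"{info['hash_method']} digest, {size_gib:.2f} GiB")  # sha256 digest, 8.25 GiB
```

The size field is in bytes, so the `.rkllm` file referenced here is roughly 8.25 GiB (about 8.86 GB in decimal units).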
README.md ADDED
@@ -0,0 +1,55 @@
+ ---
+ license: apache-2.0
+ library_name: rkllm
+ tags:
+ - rkllm
+ - rockchip
+ - rk3588
+ - qwen3
+ base_model: osmosis-ai/osmosis-mcp-4b
+ base_model_relation: quantized
+ ---
+ ### Overview
+
+ Osmosis-MCP-4B is based on the Qwen3-4B model, fine-tuned with reinforcement learning to excel at multi-step MCP-style tool usage.
+
+ We trained Osmosis-MCP-4B using a custom curriculum of **multi-turn, tool-reliant prompts** that mimic real-world use cases, for example:
+
+ > *"Given the weather in San Francisco, what are the top hiking locations?"*
+
+ In addition, we provide a list of deterministic MCP-like functions and mock server-side behavior for the model to call and use.
+
+ This requires the model to reason through multiple tool invocations (e.g., weather → location ranker) and to choose tools over intuition when applicable.
+
+ ---
+
+ ### Training Approach
+
+ Our training pipeline leverages:
+
+ - [**Dr. GRPO**](https://arxiv.org/abs/2503.20783) for stable and sample-efficient reinforcement learning.
+ - **Synthetic multi-step MCP interactions** with strong tool chaining behavior, generated using our internal data engine.
+ - **SGLang + VeRL** for efficient multi-turn rollout environments, built on top of Qwen3-4B for its function-calling capabilities.
+
+ Through this training methodology, we observed a notable behavioral shift: the model **prefers invoking tools** when appropriate, instead of relying solely on pre-trained intuition, a key milestone for MCP-native agents.
+
+ ---
+
+ ### Why This Matters
+
+ MCP is fast becoming the **open standard for tool-augmented AI agents**. However:
+
+ - Most top-performing models (e.g., Claude 3.7 Sonnet, Gemini 2.5 Pro) are closed.
+ - Tool sprawl across clients and servers creates complexity.
+ - Open models often lack the training to effectively **use tools** at all.
+
+ <!-- Osmosis-MCP-4B addresses all three: it’s small, powerful, and practical.
+
+ INFO: Setting chat_template to " \n<|im_start|>user\n[content]<|im_end|>\n<|im_start|>assistant\n"
+ INFO: Setting token_id of eos to 151645
+ INFO: Setting token_id of pad to 151643
+ INFO: Setting token_id of bos to 151643
+ INFO: Setting add_bos_token to False -->
+
+
+ ---
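
The conversion log embedded in the README comment records the chat template and special token IDs that were set when the `.rkllm` file was built. As a rough illustration of what that template produces for a single user turn, the sketch below substitutes a message for the `[content]` placeholder. The function name `build_prompt` and the constant names are hypothetical; the template string and token IDs are copied from the log. Whether the RKLLM runtime applies this template automatically is not stated in the log, so this shows only the string layout.

```python
# Template string copied from the conversion log; "[content]" marks where
# the user message goes.
CHAT_TEMPLATE = " \n<|im_start|>user\n[content]<|im_end|>\n<|im_start|>assistant\n"

# Special token IDs as reported by the conversion log.
EOS_TOKEN_ID = 151645
PAD_TOKEN_ID = 151643
BOS_TOKEN_ID = 151643
ADD_BOS_TOKEN = False


def build_prompt(user_message: str) -> str:
    """Fill the single-turn chat template with one user message."""
    return CHAT_TEMPLATE.replace("[content]", user_message)


prompt = build_prompt(
    "Given the weather in San Francisco, what are the top hiking locations?"
)
print(repr(prompt))
```

The prompt ends with the opening of the assistant turn, so generation continues from there until the model emits the EOS token (ID 151645 per the log).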