Build the rkllm format of model Osmosis-Mcp

Files changed (3)

.gitattributes +1 -0
Osmosis-Mcp.rkllm +3 -0
README.md +55 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Osmosis-Mcp.rkllm filter=lfs diff=lfs merge=lfs -text

Osmosis-Mcp.rkllm ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9031c0a378f84cef3365efcb83371999e4159ef5c78e36153ab1ba106d21ff20
+size 8855189526
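
The three lines above are a Git LFS pointer: the repository stores only the hash and size, and the roughly 8.9 GB model binary lives in LFS storage. As a minimal sketch, a downloaded copy could be checked against the pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and introduced only for illustration; the pointer text is copied from the diff.

```python
import hashlib
import os

# Pointer contents copied verbatim from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9031c0a378f84cef3365efcb83371999e4159ef5c78e36153ab1ba106d21ff20
size 8855189526
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash named by the pointer."""
    # Cheap check first: a size mismatch means hashing is unnecessary.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte model is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Calling `matches_pointer("Osmosis-Mcp.rkllm", parse_lfs_pointer(POINTER_TEXT))` on a local download would then report whether the file is complete; the local path is an assumption about where the file was saved.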

README.md ADDED

@@ -0,0 +1,55 @@
+---
+license: apache-2.0
+library_name: rkllm
+tags:
+- rkllm
+- rockchip
+- rk3588
+- qwen3
+base_model: osmosis-ai/osmosis-mcp-4b
+base_model_relation: quantized
+---
+### Overview
+Osmosis-MCP-4B is based on the Qwen3-4B model and fine-tuned with reinforcement learning to excel at multi-step, MCP-style tool usage.
+We trained Osmosis-MCP-4B using a custom curriculum of **multi-turn, tool-reliant prompts** that mimic real-world use cases, for example:
+> *"Given the weather in San Francisco, what are the top hiking locations?"*
+In addition, we provide a list of deterministic, MCP-like functions and mock server-side behavior for the model to call and use.
+This requires the model to reason through multiple tool invocations (e.g., weather → location ranker) and to choose tools over intuition when applicable.
+---
+### Training Approach
+Our training pipeline leverages:
+- [**Dr. GRPO**](https://arxiv.org/abs/2503.20783) for stable and sample-efficient reinforcement learning.
+- **Synthetic multi-step MCP interactions** with strong tool chaining behavior, generated using our internal data engine.
+- **SGLang + VeRL** for efficient multi-turn rollout environments, built on top of Qwen3-4B for its function-calling capabilities.
+Through this training methodology, we observed a notable behavioral shift: the model **prefers invoking tools** when appropriate, instead of relying solely on pre-trained intuition — a key milestone for MCP-native agents.
+---
+### Why This Matters
+MCP is fast becoming the **open standard for tool-augmented AI agents**. However:
+- Most top-performing models (e.g., Claude 3.7 Sonnet, Gemini 2.5 Pro) are closed.
+- Tool sprawl across clients and servers creates complexity.
+- Open models often lack the training to effectively **use tools** at all.
+<!-- Osmosis-MCP-4B addresses all three — it’s small, powerful, and practical.
+INFO: Setting chat_template to " \n<|im_start|>user\n[content]<|im_end|>\n<|im_start|>assistant\n"
+INFO: Setting token_id of eos to 151645
+INFO: Setting token_id of pad to 151643
+INFO: Setting token_id of bos to 151643
+INFO: Setting add_bos_token to False -->
+---
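
The conversion log kept in the README comment records the chat template and special token ids baked into the `.rkllm` build. As a rough sketch, the text the model sees for a single user turn could be reproduced as shown below. It assumes the `\n` sequences in the log are newline escapes and that `[content]` is the placeholder for the user message; whether the runtime applies this template automatically is not stated in the source. The function names are hypothetical.

```python
# Values copied from the INFO lines in the README comment.
# Assumption: "\n" in the log denotes a newline character.
CHAT_TEMPLATE = " \n<|im_start|>user\n[content]<|im_end|>\n<|im_start|>assistant\n"
EOS_TOKEN_ID = 151645
PAD_TOKEN_ID = 151643
BOS_TOKEN_ID = 151643
ADD_BOS_TOKEN = False  # the log says no BOS token is prepended


def render_prompt(user_message):
    """Substitute the user message for the [content] placeholder (hypothetical helper)."""
    return CHAT_TEMPLATE.replace("[content]", user_message)


def truncate_at_eos(token_ids):
    """Keep generated token ids up to, but not including, the first EOS id."""
    kept = []
    for token_id in token_ids:
        if token_id == EOS_TOKEN_ID:
            break
        kept.append(token_id)
    return kept


if __name__ == "__main__":
    print(render_prompt("Given the weather in San Francisco, what are the top hiking locations?"))
```

The template covers only a single user turn with no system prompt, so multi-turn or tool-result messages would need additional formatting that the log does not describe.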