Trilogix1/Hugston-forestliutcMarsRL-f16

GGUF

conversational


Trilogix1 committed 21 days ago

Commit cbdee84 (verified) · 1 parent: 90bf937

Update README.md

Files changed (1)

README.md +44 -42


@@ -1,9 +1,8 @@
----
-license: mit
 base_model:
 - Qwen/Qwen3-30B-A3B-Thinking-2507
 ---
-<div align="center">
 #  MarsRL
 <div>
@@ -12,53 +11,56 @@ base_model:
 <a href="https://arxiv.org/pdf/2511.11373" target="_blank">Paper</a> | <a href="https://github.com/liushulinle/MarsRL" target="_blank">GitHub</a>
 </div>
-## Overview
-<hr />
-Recent progress in large language models (LLMs) has been propelled by reinforcement learning with verifiable rewards (RLVR) and test-time scaling. However, the limited output length of LLMs constrains the depth of reasoning attainable in a single inference process. Multi-agent reasoning systems offer a promising alternative by employing multiple agents, including Solver, Verifier, and Corrector, to iteratively refine solutions. While effective in closed-source models like Gemini 2.5 Pro, they struggle to generalize to open-source models due to insufficient critic and correction capabilities. To address this, we propose MarsRL, a novel reinforcement learning framework with agentic pipeline parallelism, designed to jointly optimize all agents in the system. MarsRL introduces agent-specific reward mechanisms to mitigate reward noise and employs pipeline-inspired training to enhance efficiency in handling long trajectories. Applied to Qwen3-30B-A3B-Thinking-2507, MarsRL improves AIME2025 accuracy from 86.5% to 93.3% and BeyondAIME from 64.9% to 73.8%, even surpassing Qwen3-235B-A22B-Thinking-2507. These findings highlight the potential of MarsRL to advance multi-agent reasoning systems and broaden their applicability across diverse reasoning tasks.
-<div align="center">
-<img src="home.jpg" width="80%" />
-</div>
-## V-C Reasoning System Evaluation Instructions
-<hr />
-### step1: Download our released model or other open source models
-Supported models: Qwen3/DeepSeekV3.1/DeepSeek R1. You can modify the llm_client.py to use other models.
-### step2: Deploy service via vLLM
-### step3: Run the V-C reasoning system by the following commands:
-```
-python3 vc_reasoning_system.py solver_ip_port_1,solver_ip_port_2,... vc_ip_port_1,vc_ip_port_2,... test_file output_dir
-for example: python3 vc_reasoning_system.py 8.8.8.8:8021,12.34.56.78:8021 8.8.8.8:8021,12.34.56.78:8021 ./outputs/debug ./test_corpus/aime2025.jsonl
-```
-This step runs the reasoning system on each problem in the given `test_file`; the predicted results are written to `output_dir`.
-### step4: Extract final solutions by the following commands:
-```
-python3 extract_solution.py result_dir test_file
-for example: python3 extract_solution.py ./outputs/debug ./test_corpus/aime_2025.jsonl
-```
-This step generates a file named "eval_overalljsonl" in `result_dir`. You can evaluate the metrics based on this file.
-## Acknowledgements
-<hr />
-- Our implementation is heavily built on [verl](https://github.com/volcengine/verl).
-- Our models are trained on top of [Qwen3-30B-A3B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507).
-- Our V-C Reasoning system is built on the [IMO25 pipeline](https://github.com/lyang36/IMO25).
-Thanks for their wonderful work.
-## Citation
-<hr />
-```bibtex
-@article{Marsrl2025,
-    title = {MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism},
-    author = {Shulin Liu and Dong Du and Tao Yang and Yang Li and Boyu Qiu},
-    year = {2025}
-}
-```

+Trilogix1/Hugston-forestliutcMarsRL-f16 pipeline_tag: text-generation tags:
+# Thinking
+# Coder
+# Hugston
+# Trilogix1/Hugston-forestliutcMarsRL-f16
+---
+# Original weights at: https://huggingface.co/forestliutc/MarsRL
+This is a converted and quantized version by the Hugston Team, created with Quanta (see GitHub for more about it).
+This is a crude, proof-of-concept implementation that converts and quantizes a .safetensors LLM model to GGUF.
+![Screenshot 2025-11-21 114116](https://cdn-uploads.huggingface.co/production/uploads/6818be9259cb758d06603579/UtgEP2NnXMLEV7rYEYoDw.png)
+Quantization was performed using an automated method, which reduces conversion time.
+This model was made possible by: https://Hugston.com
+You can use the model with HugstonOne Enterprise Edition.
+In testing we encountered small precision errors in coding tasks, but the model is quite impressive for its size.
+We consider the model fit for non-precision tasks (such as game vibe coding and general tasks).
+![Screenshot 2025-11-22 133409](https://cdn-uploads.huggingface.co/production/uploads/6818be9259cb758d06603579/_sEqG886fGo6qv1y0dY7B.png)
+---
+Watch HugstonOne coding and preview in action:
+---
+https://vimeo.com/1121493834?share=copy&fl=sv&fe=ci
+Usage
+---
+-Download the HugstonOne app at Hugston.com or at https://github.com/Mainframework
+---
+-Download the model from https://hugston.com/explore?folder=llm_models or Hugging Face
+---
+-If you already have the LLM model downloaded, choose it by clicking "pick model" in HugstonOne.
+-Then click "Load model" in CLI or Server.
+---
+-For multimodal use you need a VL/multimodal LLM model with the mmproj file in the same folder.
+-Select the model and select the mmproj file.
+---
+-Note: if the mmproj file is in the same folder as non-multimodal models, those models will not load unless the mmproj file is moved out of the folder.
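
The mmproj note in the usage steps can be checked before loading a model. The following is a minimal sketch, not part of HugstonOne: the function name `models_at_risk` is hypothetical, and it assumes that projector files are `.gguf` files with "mmproj" in their name and that a folder holding one projector and exactly one other model is a valid multimodal pair. It cannot tell which model is actually multimodal, so with several models it simply lists them all.

```python
import sys
from pathlib import Path


def models_at_risk(folder):
    """List model files that may fail to load because an mmproj file shares their folder.

    Assumption: projector files are .gguf files whose name contains "mmproj".
    """
    ggufs = sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".gguf"
    )
    projectors = [name for name in ggufs if "mmproj" in name.lower()]
    models = [name for name in ggufs if "mmproj" not in name.lower()]
    # One projector plus one model is treated as a matching multimodal pair.
    # With more models, any of them could be a text-only model affected by the note.
    if projectors and len(models) > 1:
        return models
    return []


if __name__ == "__main__":
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in models_at_risk(folder):
        print(f"check {name}: an mmproj file is in the same folder")
```

Run it with the model folder as the only argument. If it prints any names, moving the mmproj file (together with its multimodal model) into its own folder follows the advice in the note.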