Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper: arXiv 2411.14405
Install from winget (Windows):

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf cortexso/marco-o1
# Run inference directly in the terminal:
llama-cli -hf cortexso/marco-o1

Install from a pre-built binary:

# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf cortexso/marco-o1
# Run inference directly in the terminal:
./llama-cli -hf cortexso/marco-o1

Build from source:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf cortexso/marco-o1
# Run inference directly in the terminal:
./build/bin/llama-cli -hf cortexso/marco-o1

Run with Docker:

docker model run hf.co/cortexso/marco-o1

Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding, which are well-suited for reinforcement learning (RL), but also places greater emphasis on open-ended resolutions. We aim to address the question: "Can the o1 model effectively generalize to broader domains where clear standards are absent and rewards are challenging to quantify?"
Currently, the Marco-o1 Large Language Model (LLM) is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies, optimized for complex real-world problem-solving tasks.
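Marco-o1's actual MCTS integration searches over LLM-generated reasoning steps scored by model-derived rewards; the sketch below is only a generic illustration of the search loop itself (UCT selection, expansion, rollout, backpropagation). The `expand` and `reward` functions here are hypothetical toy placeholders, not the model's real step generator or reward signal.

```python
import math
import random

def expand(state):
    """Hypothetical placeholder: propose candidate next reasoning steps."""
    return [state + c for c in "ab"] if len(state) < 4 else []

def reward(state):
    """Hypothetical placeholder: score a terminal reasoning trace."""
    return state.count("a") / max(len(state), 1)

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Unvisited children are explored first.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def mcts(root_state, iterations=200, seed=0):
    random.seed(seed)
    root = Node(root_state)
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: attach candidate next steps.
        for next_state in expand(node.state):
            node.children.append(Node(next_state, parent=node))
        if node.children:
            node = random.choice(node.children)
        # Rollout: random continuation to a terminal state.
        state = node.state
        while expand(state):
            state = random.choice(expand(state))
        value = reward(state)
        # Backpropagation: update statistics along the path.
        while node:
            node.visits += 1
            node.value += value
            node = node.parent
    # Commit to the most-visited first step.
    return max(root.children, key=lambda n: n.visits).state
```

In the real system the rollout and reward would come from the language model itself; this toy version only shows how visit statistics steer the search toward higher-reward branches.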
| No | Variant | Cortex CLI command |
|---|---|---|
| 1 | Marco-o1-8b | cortex run marco-o1:8b |
Run with the Cortex CLI (repo: cortexhub/marco-o1):

cortex run marco-o1
Available quantizations:
- 2-bit
- 3-bit
- 4-bit
- 5-bit
- 6-bit
- 8-bit
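These n-bit variants mainly trade output quality against memory. A back-of-the-envelope sketch of the weight footprint (the function name is illustrative; it ignores per-block scale factors, the KV cache, and runtime overhead):

```python
def approx_weight_size_gb(params_billions: float, bits: int) -> float:
    """Rule of thumb: params * (bits / 8) bytes for the weights alone."""
    # 1e9 params * (bits / 8) bytes per weight / 1e9 bytes per GB
    return params_billions * bits / 8

# For an 8B-parameter model: 4-bit gives about 4 GB of weights,
# 8-bit about 8 GB.
```

Actual GGUF files run somewhat larger than this estimate because quantization formats also store per-block scale metadata.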
Install from brew:

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf cortexso/marco-o1
# Run inference directly in the terminal:
llama-cli -hf cortexso/marco-o1
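Whichever install route you choose, `llama-server` exposes an OpenAI-compatible HTTP API. A minimal client sketch in Python, assuming the server is running on its default address `http://localhost:8080` (the helper names and prompt are illustrative):

```python
import json
import urllib.request

def build_request(prompt, url="http://localhost:8080/v1/chat/completions"):
    # llama-server serves the single loaded model, so the "model" field
    # is informational here.
    payload = {
        "model": "marco-o1",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(prompt):
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Example (requires a running server):
# print(chat("How many 'r's are in 'strawberry'?"))
```

Because the endpoint follows the OpenAI chat-completions shape, existing OpenAI client libraries can also be pointed at the same base URL.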