Instructions to use MainStack/marvy-1-14B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MainStack/marvy-1-14B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="MainStack/marvy-1-14B") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("MainStack/marvy-1-14B") model = AutoModelForCausalLM.from_pretrained("MainStack/marvy-1-14B") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - MLX
How to use MainStack/marvy-1-14B with MLX:
# Make sure mlx-lm is installed # pip install --upgrade mlx-lm # Generate text with mlx-lm from mlx_lm import load, generate model, tokenizer = load("MainStack/marvy-1-14B") prompt = "Write a story about Einstein" messages = [{"role": "user", "content": prompt}] prompt = tokenizer.apply_chat_template( messages, add_generation_prompt=True ) text = generate(model, tokenizer, prompt=prompt, verbose=True) - Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- vLLM
How to use MainStack/marvy-1-14B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "MainStack/marvy-1-14B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MainStack/marvy-1-14B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/MainStack/marvy-1-14B
- SGLang
How to use MainStack/marvy-1-14B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "MainStack/marvy-1-14B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MainStack/marvy-1-14B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "MainStack/marvy-1-14B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MainStack/marvy-1-14B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Pi
How to use MainStack/marvy-1-14B with Pi:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "MainStack/marvy-1-14B"
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "mlx-lm": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "MainStack/marvy-1-14B" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use MainStack/marvy-1-14B with Hermes Agent:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "MainStack/marvy-1-14B"
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default MainStack/marvy-1-14B
Run Hermes
hermes
- MLX LM
How to use MainStack/marvy-1-14B with MLX LM:
Generate or start a chat session
# Install MLX LM uv tool install mlx-lm # Interactive chat REPL mlx_lm.chat --model "MainStack/marvy-1-14B"
Run an OpenAI-compatible server
# Install MLX LM uv tool install mlx-lm # Start the server mlx_lm.server --model "MainStack/marvy-1-14B" # Calling the OpenAI-compatible server with curl curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "MainStack/marvy-1-14B", "messages": [ {"role": "user", "content": "Hello"} ] }' - Docker Model Runner
How to use MainStack/marvy-1-14B with Docker Model Runner:
docker model run hf.co/MainStack/marvy-1-14B
Licensing — marvy-1-14B
marvy-1-14B uses a layered (dual) license that reflects what is built on top of an upstream open model versus what MainStack authored.
| Component | License | What it covers |
|---|---|---|
Model weights (*.safetensors, GGUF quants, LoRA adapter) |
Apache-2.0 | The fine-tuned weights. These are a derivative of Qwen2.5-14B-Instruct (Apache-2.0); per that license they remain Apache-2.0 and free to use, modify, and redistribute. |
| MainStack original contributions | CC-BY-4.0 | The model cards, documentation (USAGE.md, VALIDATION.md, benchmark), the benchmark charts, the curated training-data methodology, and the pipeline framing authored by MainStack. |
What this means in practice
You may (under Apache-2.0, for the weights)
- Use marvy-1-14B commercially, privately, or in research.
- Fine-tune, distill, quantize, merge, or otherwise build on the weights.
- Redistribute the weights, including modified versions.
…provided you retain the NOTICE file in derivatives and redistributions
(Apache-2.0 §4(d) — this is mandatory and carries the attribution request).
You must (under CC-BY-4.0, for our contributions)
If you reuse MainStack's documentation, model cards, benchmark, or charts — e.g. copying our eval methodology, reproducing our charts, or lifting card text into your own model — you must give attribution: credit "MainStack" and link to https://huggingface.co/MainStack/marvy-1-14B. This is a binding condition of CC-BY-4.0, not just a request.
We ask (attribution for the model)
If you use marvy-1-14B as a baseline, a starting point for your own fine-tune,
a distillation source, or an evaluation comparison, please credit MainStack
and cite the entry in the model card. See NOTICE and the card's Citation
section.
Why the weights can't be more restricted
Qwen2.5-14B-Instruct is released under Apache-2.0, which grants every recipient an irrevocable, royalty-free right to use and redistribute. A fine-tune cannot revoke those rights on the resulting weights. MainStack's protection therefore lives where it legally can: (1) the CC-BY-4.0 license on our own authored materials, and (2) the Apache-2.0 NOTICE that must travel with the weights.
Files
LICENSE— Apache-2.0 (governs the weights; inherited from the base model).LICENSE-CC-BY-4.0— CC-BY-4.0 (governs MainStack's documentation and other original contributions).NOTICE— required attribution notices (retain in derivatives).CITATION.cff— citation metadata.