Instructions to use samuelcardillo/Carnice-MoE-35B-A3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- HERMES
How to use samuelcardillo/Carnice-MoE-35B-A3B with HERMES:
# No code snippets available yet for this library. # To use this model, check the repository files and the library's documentation. # Want to help? PRs adding snippets are welcome at: # https://github.com/huggingface/huggingface.js
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- Unsloth Studio
How to use samuelcardillo/Carnice-MoE-35B-A3B with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for samuelcardillo/Carnice-MoE-35B-A3B to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for samuelcardillo/Carnice-MoE-35B-A3B to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for samuelcardillo/Carnice-MoE-35B-A3B to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="samuelcardillo/Carnice-MoE-35B-A3B", max_seq_length=2048, )
Carnice MoE 35B-A3B — Hermes-Focused Agentic Model
QLoRA fine-tune of Qwen3.5-35B-A3B (MoE, 3B active parameters) optimized for agentic workflows and Hermes Agent runtime. Two-stage training adapted from kai-os/Carnice-9b.
Credits
Training methodology adapted from kai-os/Carnice-9b — same two-stage approach and datasets, applied to the larger MoE architecture. Key inspiration: training on actual Hermes Agent execution traces for native agentic behavior.
For GGUF quantizations (Q4, Q5, Q6, Q8, MXFP4), see samuelcardillo/Carnice-MoE-35B-A3B-GGUF.
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-35B-A3B |
| Architecture | Mixture of Experts (MoE) |
| Total Parameters | ~35B |
| Active Parameters | ~3B per token |
What Makes This Different
Unlike generic reasoning distillation, this model was trained on actual Hermes Agent execution traces — real conversations where an AI agent:
- Executes terminal commands and processes output
- Performs file editing operations
- Chains multi-step tool calls with results feeding back
- Uses browser-assisted workflows
- Makes decisions based on environmental feedback
This teaches the model the exact conversation patterns Hermes expects, rather than just generic reasoning.
Training Details
Two-Stage Approach
Stage A — Reasoning Repair (1 epoch)
- Strengthens base model reasoning before agent-specific training
- Loss: 0.4159
| Dataset | Examples |
|---|---|
| bespokelabs/Bespoke-Stratos-17k | 16,710 |
| AI-MO/NuminaMath-CoT | 17,000 (capped) |
Stage B — Hermes Traces (2 epochs)
- Agent-specific behavioral training on real execution traces
- Loss: 0.3115
| Dataset | Examples |
|---|---|
| kai-os/carnice-glm5-hermes-traces | 1,627 (high quality) |
| open-thoughts/OpenThoughts-Agent-v1-SFT | 15,209 |
Training Configuration
| Parameter | Stage A | Stage B |
|---|---|---|
| LoRA Rank | 64 | 64 |
| LoRA Alpha | 64 | 64 |
| LoRA Targets | q, k, v, o projections | q, k, v, o projections |
| Learning Rate | 2e-5 (linear) | 1e-5 (cosine) |
| Epochs | 1 | 2 |
| Effective Batch | 12 | 12 |
| Context Length | 4096 | 4096 |
| Precision | 4-bit QLoRA + BF16 adapters | Same |
| GPU | RTX PRO 6000 Blackwell (96GB) | Same |
| Total Training Time | ~44 hours (both stages) |
Trainable Parameters
6,881,280 (0.02% of 35B total)
Usage with llama.cpp
llama-server \
--model Carnice-MoE-35B-A3B-Q8_0.gguf \
--n-gpu-layers -1 \
--ctx-size 131072 \
--host 0.0.0.0 --port 8082
Acknowledgements
- kai-os — Carnice training methodology and Hermes traces dataset
- open-thoughts — Agent SFT dataset
- bespokelabs — Bespoke-Stratos reasoning dataset
- Unsloth — QLoRA training framework
- Qwen — Base model
- Downloads last month
- -