# FRIDAY-35B
A reasoning-enhanced 35B parameter Mixture-of-Experts model fine-tuned for senior software engineering. Built on Qwen/Qwen3.6-35B-A3B (256 experts, 8 active per token, ~3B active parameters per forward pass).
FRIDAY reasons at a staff+ engineer level — architectural thinking, tradeoff analysis, and code review with root-cause depth.
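For intuition, here is a schematic of the top-k expert routing behind the "~3B active parameters" figure. This is a sketch, not the actual Qwen implementation; the hidden size of 4096 is an assumption.

```python
import torch

# Schematic MoE routing: a router scores 256 experts per token and only
# the top 8 run, which is why only ~3B of the 35B parameters are active
# per forward pass. (Illustrative; not the actual Qwen routing code.)
def route(hidden: torch.Tensor, router: torch.nn.Linear, k: int = 8):
    probs = router(hidden).softmax(dim=-1)                 # [tokens, 256]
    weights, experts = torch.topk(probs, k, dim=-1)        # top-8 per token
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize
    return experts, weights

router = torch.nn.Linear(4096, 256, bias=False)  # hidden size 4096 is assumed
experts, weights = route(torch.randn(10, 4096), router)
```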
## What FRIDAY Does
- Code review: Identifies concurrency bugs, data consistency issues, and architectural anti-patterns
- System design: Diagnosis → root causes → short-term/long-term solutions
- Architectural reasoning: Evaluates tradeoffs rather than prescribing a single answer
- Multi-language: Rust, Python, TypeScript, C++, Go, Java
## Eval

Evaluated on a buggy async Python checkout service with 10 planted bugs:
| Metric | FRIDAY-35B | Competitor (API) |
|---|---|---|
| Bugs found | 10/10 | 7/10 |
| Time | 19.5s | 53.2s |
| Output tokens | 3,156 | 4,226 |
| Throughput | ~162 tok/s | ~79 tok/s |
FRIDAY found all 10 bugs across both runs. The competitor missed three: lock TTL expiration during slow payments, a null product-row dereference, and a Redis type mismatch on `lpush`. FRIDAY also flagged the Redis distributed lock as architecturally redundant given proper DB-level locking.
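For context, a minimal sketch of the lock-TTL bug class the competitor missed. This is illustrative, not the actual eval service; `charge_card` is a hypothetical stand-in for a slow payment call.

```python
import asyncio
import redis.asyncio as redis  # requires redis-py >= 4.2

r = redis.Redis()

async def charge_card(order_id: str) -> None:
    await asyncio.sleep(10)  # stand-in for a slow payment-provider call

async def checkout(order_id: str) -> None:
    lock_key = f"lock:{order_id}"
    # BUG (planted-bug class): 5s TTL, but the payment can take longer.
    # If the TTL expires mid-payment, a second worker acquires the lock
    # and the order can be charged twice.
    if not await r.set(lock_key, "1", nx=True, ex=5):
        return
    try:
        await charge_card(order_id)
    finally:
        await r.delete(lock_key)  # may now delete another worker's lock
```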
## Training

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3.6-35B-A3B |
| Architecture | MoE: 256 experts, 8 active/token, GDN hybrid attention |
| Method | Full fine-tune (SFT) |
| Training data | 2,472 reasoning traces |
| Sequence length | 8,192 tokens |
| Epochs | 3 |
| Learning rate | 2e-5, cosine schedule |
| Precision | BF16 + TF32 |
| Framework | TRL SFTTrainer + DeepSpeed ZeRO-3 |
| Hardware | 8× A100 80GB |
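A minimal sketch of how these hyperparameters map onto TRL's `SFTTrainer`. The dataset path, DeepSpeed config filename, and output directory are placeholders, not the actual training script; argument names track recent TRL releases.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder path for the 2,472 reasoning traces.
dataset = load_dataset("json", data_files="reasoning_traces.jsonl", split="train")

config = SFTConfig(
    output_dir="friday-35b-sft",
    num_train_epochs=3,            # from the table above
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    max_seq_length=8192,           # renamed to `max_length` in newer TRL
    bf16=True,
    tf32=True,
    deepspeed="ds_zero3.json",     # hypothetical ZeRO-3 config path
)

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-35B-A3B",  # base model from the table
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

At this scale the script would be launched across the 8× A100s, e.g. with `accelerate launch` or the `deepspeed` CLI.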
## Usage

### With SGLang

```bash
python -m sglang.launch_server \
    --model-path dangell7/Friday-35B \
    --dtype bfloat16 \
    --tp 8 \
    --trust-remote-code
```
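Once the server is up, SGLang exposes an OpenAI-compatible API (port 30000 by default); a minimal client call, assuming the `openai` Python package:

```python
from openai import OpenAI

# SGLang serves an OpenAI-compatible endpoint on port 30000 by default.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="none")

resp = client.chat.completions.create(
    model="dangell7/Friday-35B",
    messages=[{"role": "user", "content": "Review this handler for race conditions: ..."}],
)
print(resp.choices[0].message.content)
```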
### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dangell7/Friday-35B",
    torch_dtype="bfloat16",
    device_map="auto",
    trust_remote_code=True,  # matches the --trust-remote-code flag above
)
tokenizer = AutoTokenizer.from_pretrained("dangell7/Friday-35B", trust_remote_code=True)
```
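A minimal generation example using the chat template (the prompt and token budget are illustrative):

```python
messages = [{"role": "user", "content": "Review this function for concurrency bugs: ..."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```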
## Limitations
- Autoregressive LLM; may hallucinate technical details
- MoE architecture requires significant VRAM (~8× A100 or equivalent)
- Not a substitute for human code review in production systems
## Acknowledgements
- Qwen team for Qwen3.6-35B-A3B
- SGLang for high-performance MoE serving
- TRL and DeepSpeed for training infrastructure
## Citation

```bibtex
@misc{Friday_35B,
  title        = {FRIDAY-35B},
  author       = {dangell7},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/dangell7/Friday-35B}}
}
```