Condor-27B / README.md
dangell7's picture
Initial upload: fine-tune weights, config, tokenizer, model card
544dcf2 verified
---
license: mit
base_model: Jackrong/Qwopus3.5-27B-v3
tags:
- security
- reasoning
- qwen3_5
- distillation
- fine-tuned
language:
- en
pipeline_tag: text-generation
---
# Condor-27B
A security-reasoning fine-tune of [`Jackrong/Qwopus3.5-27B-v3`](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3), distilled from Claude Opus reasoning traces on exploit development, vulnerability analysis, and defensive security topics.
## Model Summary
- **Base model:** `Jackrong/Qwopus3.5-27B-v3` (27B, Qwen3.5 hybrid linear/full attention architecture)
- **Training type:** Full fine-tune (bf16, DeepSpeed ZeRO-3)
- **Focus:** Security reasoning β€” binary exploitation, web/app vulnerabilities, kernel/OS internals, cryptography, network attacks, defensive analysis
- **Intended use:** CTF assistance, security research, reading along with security books, pentesting thought-partner
## Training
| | |
|---|---|
| Dataset size | 7,735 reasoning traces |
| Source prompts | 35+ security books (seed prompts per chapter) |
| Trace generator | Claude Opus (Anthropic API) |
| Steps | 1,395 |
| Wall time | 43h 43m |
| Hardware | 8Γ— H100 (RunPod) |
| Precision | bf16 |
| Parallelism | DeepSpeed ZeRO-3 |
| Final eval loss | 0.99 |
The training data was generated by prompting Claude Opus with questions derived from security literature (books, papers, writeups) and capturing its full reasoning chain. No multi-turn dialogue β€” single-prompt reasoning traces only.
## Serving
The model uses the same Qwen3.5-27B hybrid mamba architecture as the base, so any serving framework that supports that base works here. Tested with **sglang** on 2Γ— A100 40GB:
```
python -m sglang.launch_server \
--model-path dangell7/Condor-27B \
--trust-remote-code \
--tp-size 2 \
--dtype bfloat16 \
--context-length 8192 \
--mem-fraction-static 0.85 \
--kv-cache-dtype fp8_e5m2 \
--port 30000
```
Requires `transformers>=5.3.0` and sglang with PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30) β€” earlier versions leak mamba slots under concurrent load and deadlock the scheduler.
Observed decode throughput: **~38 tok/s** on 2Γ— A100 40GB, tp=2, single client.
### Known caveats
1. **Chat template quirk (inherited from base):** Responses may emit a stray `</think>` closing tag without a matching opening tag. This is a pre-existing quirk of `Qwopus3.5-27B-v3` and not introduced by this fine-tune. Strip it in post-processing if it breaks your parser.
2. **Longer outputs:** This fine-tune learned to produce denser, longer reasoning than the base (structured sections, code snippets, citations). Set `max_tokens` β‰₯ 4096 for complex prompts or expect truncation.
3. **Tokenizer:** Native tokenizer is included (identical vocab to the Qwen3.5-27B base model; no new tokens were added during fine-tuning). Requires `transformers>=5.3.0` to load.
4. **Concurrent serving:** sglang's hybrid mamba scheduler leaks mamba slots under 2+ concurrent requests in versions before PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30). Use sglang main post that commit, or serialize requests at a gateway for older versions.
## Evaluation
Qualitative side-by-side vs base (`Jackrong/Qwopus3.5-27B-v3`) on 5 fixed prompts covering math, code debugging, systems reasoning, logic, and networking:
| Prompt | Base | Condor-27B |
|---|---|---|
| Multi-step math | Correct | Correct, headered sections + verification |
| Code bug hunt | Correct | Correct, more senior-voice (`itertools.accumulate` alternative) |
| GC vs manual vs ownership tradeoffs | Correct, textbook-shallow | Correct, dramatically deeper (G1/ZGC internals, code, fairness analysis) |
| Three-box logic puzzle | Correct | Correct, tighter deduction chain |
| TCP congestion control | Correct, Reno-focused | Correct, deeper (RFC citations, ASCII sawtooth, what-this-didn't-solve table) |
**Summary:** Correctness preserved across all 5 prompts with no regressions. Responses are markedly denser and more specific β€” more Opus-like in voice and structure. No repetition, mode collapse, or drift observed.
Full eval traces: see `eval/` (if published) or reproduce with the `vibe_client.py` harness.
## Intended Use & Limitations
**Intended use:**
- Security research, CTF assistance, reading/learning alongside security literature
- Thought-partner for pentesting workflows with human oversight
- Reasoning-chain generation for further distillation
**Out of scope / don't use for:**
- Autonomous offensive security operations
- Targeting systems you don't own or have explicit authorization to test
- Factual lookup on specific CVEs, RFCs, or fast-moving details β€” verify independently (the model has been observed to confidently mis-cite RFC numbers)
- Non-English prompts (trained on English reasoning traces only)
## Provenance
Distilled from Claude Opus outputs via the Anthropic API. Anthropic's terms of service allow using model outputs for your own purposes including training; downstream users of this model should read Anthropic's [usage policy](https://www.anthropic.com/legal/aup) and determine their own compliance obligations.
## License
MIT (see LICENSE). The base model's license applies to its weights; this fine-tune's delta is released under MIT.
## Citation
```bibtex
@misc{condor-27b,
author = {Angell, Denis},
title = {Condor-27B: A security-reasoning fine-tune of Qwopus3.5-27B-v3},
year = {2026},
url = {https://huggingface.co/dangell7/Condor-27B},
}
```