| --- |
| license: mit |
| base_model: Jackrong/Qwopus3.5-27B-v3 |
| tags: |
| - security |
| - reasoning |
| - qwen3_5 |
| - distillation |
| - fine-tuned |
| language: |
| - en |
| pipeline_tag: text-generation |
| --- |
| |
| # Condor-27B |
|
|
| A security-reasoning fine-tune of [`Jackrong/Qwopus3.5-27B-v3`](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3), distilled from Claude Opus reasoning traces on exploit development, vulnerability analysis, and defensive security topics. |
|
|
| ## Model Summary |
|
|
| - **Base model:** `Jackrong/Qwopus3.5-27B-v3` (27B, Qwen3.5 hybrid linear/full attention architecture) |
| - **Training type:** Full fine-tune (bf16, DeepSpeed ZeRO-3) |
| - **Focus:** Security reasoning β binary exploitation, web/app vulnerabilities, kernel/OS internals, cryptography, network attacks, defensive analysis |
| - **Intended use:** CTF assistance, security research, reading along with security books, pentesting thought-partner |
|
|
| ## Training |
|
|
| | | | |
| |---|---| |
| | Dataset size | 7,735 reasoning traces | |
| | Source prompts | 35+ security books (seed prompts per chapter) | |
| | Trace generator | Claude Opus (Anthropic API) | |
| | Steps | 1,395 | |
| | Wall time | 43h 43m | |
| | Hardware | 8Γ H100 (RunPod) | |
| | Precision | bf16 | |
| | Parallelism | DeepSpeed ZeRO-3 | |
| | Final eval loss | 0.99 | |
|
|
| The training data was generated by prompting Claude Opus with questions derived from security literature (books, papers, writeups) and capturing its full reasoning chain. No multi-turn dialogue β single-prompt reasoning traces only. |
|
|
| ## Serving |
|
|
| The model uses the same Qwen3.5-27B hybrid mamba architecture as the base, so any serving framework that supports that base works here. Tested with **sglang** on 2Γ A100 40GB: |
|
|
| ``` |
| python -m sglang.launch_server \ |
| --model-path dangell7/Condor-27B \ |
| --trust-remote-code \ |
| --tp-size 2 \ |
| --dtype bfloat16 \ |
| --context-length 8192 \ |
| --mem-fraction-static 0.85 \ |
| --kv-cache-dtype fp8_e5m2 \ |
| --port 30000 |
| ``` |
|
|
| Requires `transformers>=5.3.0` and sglang with PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30) β earlier versions leak mamba slots under concurrent load and deadlock the scheduler. |
|
|
| Observed decode throughput: **~38 tok/s** on 2Γ A100 40GB, tp=2, single client. |
|
|
| ### Known caveats |
|
|
| 1. **Chat template quirk (inherited from base):** Responses may emit a stray `</think>` closing tag without a matching opening tag. This is a pre-existing quirk of `Qwopus3.5-27B-v3` and not introduced by this fine-tune. Strip it in post-processing if it breaks your parser. |
| 2. **Longer outputs:** This fine-tune learned to produce denser, longer reasoning than the base (structured sections, code snippets, citations). Set `max_tokens` β₯ 4096 for complex prompts or expect truncation. |
| 3. **Tokenizer:** Native tokenizer is included (identical vocab to the Qwen3.5-27B base model; no new tokens were added during fine-tuning). Requires `transformers>=5.3.0` to load. |
| 4. **Concurrent serving:** sglang's hybrid mamba scheduler leaks mamba slots under 2+ concurrent requests in versions before PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30). Use sglang main post that commit, or serialize requests at a gateway for older versions. |
|
|
| ## Evaluation |
|
|
| Qualitative side-by-side vs base (`Jackrong/Qwopus3.5-27B-v3`) on 5 fixed prompts covering math, code debugging, systems reasoning, logic, and networking: |
|
|
| | Prompt | Base | Condor-27B | |
| |---|---|---| |
| | Multi-step math | Correct | Correct, headered sections + verification | |
| | Code bug hunt | Correct | Correct, more senior-voice (`itertools.accumulate` alternative) | |
| | GC vs manual vs ownership tradeoffs | Correct, textbook-shallow | Correct, dramatically deeper (G1/ZGC internals, code, fairness analysis) | |
| | Three-box logic puzzle | Correct | Correct, tighter deduction chain | |
| | TCP congestion control | Correct, Reno-focused | Correct, deeper (RFC citations, ASCII sawtooth, what-this-didn't-solve table) | |
|
|
| **Summary:** Correctness preserved across all 5 prompts with no regressions. Responses are markedly denser and more specific β more Opus-like in voice and structure. No repetition, mode collapse, or drift observed. |
|
|
| Full eval traces: see `eval/` (if published) or reproduce with the `vibe_client.py` harness. |
|
|
| ## Intended Use & Limitations |
|
|
| **Intended use:** |
| - Security research, CTF assistance, reading/learning alongside security literature |
| - Thought-partner for pentesting workflows with human oversight |
| - Reasoning-chain generation for further distillation |
|
|
| **Out of scope / don't use for:** |
| - Autonomous offensive security operations |
| - Targeting systems you don't own or have explicit authorization to test |
| - Factual lookup on specific CVEs, RFCs, or fast-moving details β verify independently (the model has been observed to confidently mis-cite RFC numbers) |
| - Non-English prompts (trained on English reasoning traces only) |
|
|
| ## Provenance |
|
|
| Distilled from Claude Opus outputs via the Anthropic API. Anthropic's terms of service allow using model outputs for your own purposes including training; downstream users of this model should read Anthropic's [usage policy](https://www.anthropic.com/legal/aup) and determine their own compliance obligations. |
|
|
| ## License |
|
|
| MIT (see LICENSE). The base model's license applies to its weights; this fine-tune's delta is released under MIT. |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{condor-27b, |
| author = {Angell, Denis}, |
| title = {Condor-27B: A security-reasoning fine-tune of Qwopus3.5-27B-v3}, |
| year = {2026}, |
| url = {https://huggingface.co/dangell7/Condor-27B}, |
| } |
| ``` |
|
|