File size: 5,538 Bytes
544dcf2
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
---
license: mit
base_model: Jackrong/Qwopus3.5-27B-v3
tags:
  - security
  - reasoning
  - qwen3_5
  - distillation
  - fine-tuned
language:
  - en
pipeline_tag: text-generation
---

# Condor-27B

A security-reasoning fine-tune of [`Jackrong/Qwopus3.5-27B-v3`](https://huggingface.co/Jackrong/Qwopus3.5-27B-v3), distilled from Claude Opus reasoning traces on exploit development, vulnerability analysis, and defensive security topics.

## Model Summary

- **Base model:** `Jackrong/Qwopus3.5-27B-v3` (27B, Qwen3.5 hybrid linear/full attention architecture)
- **Training type:** Full fine-tune (bf16, DeepSpeed ZeRO-3)
- **Focus:** Security reasoning — binary exploitation, web/app vulnerabilities, kernel/OS internals, cryptography, network attacks, defensive analysis
- **Intended use:** CTF assistance, security research, reading along with security books, pentesting thought-partner

## Training

| | |
|---|---|
| Dataset size | 7,735 reasoning traces |
| Source prompts | 35+ security books (seed prompts per chapter) |
| Trace generator | Claude Opus (Anthropic API) |
| Steps | 1,395 |
| Wall time | 43h 43m |
| Hardware | 8× H100 (RunPod) |
| Precision | bf16 |
| Parallelism | DeepSpeed ZeRO-3 |
| Final eval loss | 0.99 |

The training data was generated by prompting Claude Opus with questions derived from security literature (books, papers, writeups) and capturing its full reasoning chain. No multi-turn dialogue — single-prompt reasoning traces only.

## Serving

The model uses the same Qwen3.5-27B hybrid mamba architecture as the base, so any serving framework that supports that base works here. Tested with **sglang** on 2× A100 40GB:

```
python -m sglang.launch_server \
  --model-path dangell7/Condor-27B \
  --trust-remote-code \
  --tp-size 2 \
  --dtype bfloat16 \
  --context-length 8192 \
  --mem-fraction-static 0.85 \
  --kv-cache-dtype fp8_e5m2 \
  --port 30000
```

Requires `transformers>=5.3.0` and sglang with PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30) — earlier versions leak mamba slots under concurrent load and deadlock the scheduler.

Observed decode throughput: **~38 tok/s** on 2× A100 40GB, tp=2, single client.

### Known caveats

1. **Chat template quirk (inherited from base):** Responses may emit a stray `</think>` closing tag without a matching opening tag. This is a pre-existing quirk of `Qwopus3.5-27B-v3` and not introduced by this fine-tune. Strip it in post-processing if it breaks your parser.
2. **Longer outputs:** This fine-tune learned to produce denser, longer reasoning than the base (structured sections, code snippets, citations). Set `max_tokens` ≥ 4096 for complex prompts or expect truncation.
3. **Tokenizer:** Native tokenizer is included (identical vocab to the Qwen3.5-27B base model; no new tokens were added during fine-tuning). Requires `transformers>=5.3.0` to load.
4. **Concurrent serving:** sglang's hybrid mamba scheduler leaks mamba slots under 2+ concurrent requests in versions before PR [#21404](https://github.com/sgl-project/sglang/pull/21404) (merged 2026-03-30). Use sglang main post that commit, or serialize requests at a gateway for older versions.

## Evaluation

Qualitative side-by-side vs base (`Jackrong/Qwopus3.5-27B-v3`) on 5 fixed prompts covering math, code debugging, systems reasoning, logic, and networking:

| Prompt | Base | Condor-27B |
|---|---|---|
| Multi-step math | Correct | Correct, headered sections + verification |
| Code bug hunt | Correct | Correct, more senior-voice (`itertools.accumulate` alternative) |
| GC vs manual vs ownership tradeoffs | Correct, textbook-shallow | Correct, dramatically deeper (G1/ZGC internals, code, fairness analysis) |
| Three-box logic puzzle | Correct | Correct, tighter deduction chain |
| TCP congestion control | Correct, Reno-focused | Correct, deeper (RFC citations, ASCII sawtooth, what-this-didn't-solve table) |

**Summary:** Correctness preserved across all 5 prompts with no regressions. Responses are markedly denser and more specific — more Opus-like in voice and structure. No repetition, mode collapse, or drift observed.

Full eval traces: see `eval/` (if published) or reproduce with the `vibe_client.py` harness.

## Intended Use & Limitations

**Intended use:**
- Security research, CTF assistance, reading/learning alongside security literature
- Thought-partner for pentesting workflows with human oversight
- Reasoning-chain generation for further distillation

**Out of scope / don't use for:**
- Autonomous offensive security operations
- Targeting systems you don't own or have explicit authorization to test
- Factual lookup on specific CVEs, RFCs, or fast-moving details — verify independently (the model has been observed to confidently mis-cite RFC numbers)
- Non-English prompts (trained on English reasoning traces only)

## Provenance

Distilled from Claude Opus outputs via the Anthropic API. Anthropic's terms of service allow using model outputs for your own purposes including training; downstream users of this model should read Anthropic's [usage policy](https://www.anthropic.com/legal/aup) and determine their own compliance obligations.

## License

MIT (see LICENSE). The base model's license applies to its weights; this fine-tune's delta is released under MIT.

## Citation

```bibtex
@misc{condor-27b,
  author = {Angell, Denis},
  title  = {Condor-27B: A security-reasoning fine-tune of Qwopus3.5-27B-v3},
  year   = {2026},
  url    = {https://huggingface.co/dangell7/Condor-27B},
}
```