---
license: apache-2.0
language:
- en
tags:
- tactical-reasoning
- military
- defense-ai
- bicell-dispersal
- sft
- dual-perspective
- shepherd
- convergentintel
- qwen
- ai
base_model: Qwen/Qwen3-1.7B
datasets:
- ZennyKenny/tactical-military-reasoning-v.1.0
library_name: transformers
pipeline_tag: text-generation
---
# Shepherd-Alpha
**The first defense AI reasoning model on Hugging Face.**
Shepherd-Alpha is a tactical reasoning model fine-tuned on dual-perspective military scenario analysis using BiCell Depth Dispersal, a novel training methodology that partitions transformer layers by abstraction depth and trains them asymmetrically to separate representation encoding from task-specific reasoning.
Developed by [Convergent Intelligence LLC: Research Division](https://convergentintel.com)
## What This Model Does
Given a tactical scenario, Shepherd-Alpha produces structured dual-perspective analysis:
- **Attack reasoning**: how an adversary would exploit the situation
- **Defense reasoning**: how to counter, mitigate, and survive
The model is trained to think like both attacker and defender simultaneously. A model that understands how to attack becomes a defender that anticipates.
## Training Methodology: BiCell Depth Dispersal
Standard fine-tuning updates all layers jointly, allowing co-adaptation that can mask shallow learning. BiCell Depth Dispersal forces genuine specialization:
| Phase | Frozen | Training | Purpose |
|-------|--------|----------|---------|
| 1 | Upper layers (14-27) | Lower layers (0-13) | Foundations encode before specialization exists |
| 2 | Lower layers (0-13) | Upper layers (14-27) | Reasoning learns over frozen representations |
| 3 | None | All layers | Joint integration of asymmetric gradient history |
All three backward passes accumulate gradients before a single optimizer step. The asymmetric gradient history forces each depth zone to develop independently before integration.
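The three-phase schedule can be sketched in PyTorch with a toy layer stack. This is a minimal illustration, not the actual training code: a four-layer stack of linear modules stands in for Qwen3's 28 decoder blocks, and the zone split at index 2 stands in for the real split at layer 14.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the transformer: 4 "layers", split into two zones.
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(4)])
lo, hi = layers[:2], layers[2:]  # Zone Lo / Zone Hi

opt = torch.optim.AdamW(layers.parameters(), lr=2e-5, weight_decay=0.01)
loss_fn = nn.MSELoss()

def set_trainable(mods, flag):
    for m in mods:
        for p in m.parameters():
            p.requires_grad_(flag)

def forward(x):
    for layer in layers:
        x = torch.relu(layer(x))
    return x

x, target = torch.randn(2, 8), torch.randn(2, 8)

opt.zero_grad()

# Phase 1: freeze Zone Hi, train Zone Lo (foundations encode first).
set_trainable(hi, False); set_trainable(lo, True)
loss_fn(forward(x), target).backward()

# Phase 2: freeze Zone Lo, train Zone Hi (reasoning over frozen reps).
set_trainable(lo, False); set_trainable(hi, True)
loss_fn(forward(x), target).backward()

# Phase 3: unfreeze everything for joint integration.
set_trainable(layers, True)
loss_fn(forward(x), target).backward()

# One optimizer step over the accumulated asymmetric gradient history.
opt.step()
```

Gradients from all three phases accumulate in `.grad` before the single `opt.step()`, so each zone's update carries a different mixture of phase contributions.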
**Key finding during training:** Lower layers consistently produce ~1.7x the gradient magnitude of upper layers during domain adaptation. The pretrained upper layers already possess sufficient reasoning capacity; the primary adaptation is teaching lower layers to encode tactical domain structure. This suggests that for domain-specific SFT, representation layers (not reasoning layers) are the bottleneck.
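One way such a per-zone gradient ratio can be measured is by averaging parameter gradient norms over each zone after a backward pass. This is a hypothetical measurement sketch on a toy stack; the ~1.7x figure comes from the actual training run, not from this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(4)])

def forward(x):
    for layer in layers:
        x = torch.relu(layer(x))
    return x

# Single backward pass to populate .grad on every parameter.
loss = nn.MSELoss()(forward(torch.randn(2, 8)), torch.randn(2, 8))
loss.backward()

def zone_grad_norm(zone):
    # Mean L2 gradient norm across all parameters in the zone.
    norms = [p.grad.norm() for m in zone for p in m.parameters()]
    return torch.stack(norms).mean().item()

# Ratio of lower-zone to upper-zone gradient magnitude.
ratio = zone_grad_norm(layers[:2]) / zone_grad_norm(layers[2:])
```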
### Training Details
- **Base model:** Qwen/Qwen3-1.7B (28 layers, all full attention)
- **Dataset:** [ZennyKenny/tactical-military-reasoning-v.1.0](https://huggingface.co/datasets/ZennyKenny/tactical-military-reasoning-v.1.0): 150 dual-perspective tactical scenarios with attack and defense chain-of-thought reasoning (MIT licensed)
- **Architecture:** 28 transformer layers split at depth 14 into Zone Lo (layers 0-13) and Zone Hi (layers 14-27)
- **Hardware:** NVIDIA A100
- **Epochs:** 3
- **Batch size:** 2
- **Learning rate:** 2e-5 (AdamW, weight decay 0.01)
- **Precision:** bfloat16
- **Label masking:** Loss computed only on assistant (reasoning) tokens, not scenario prompts
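The label-masking step can be illustrated with a minimal sketch. The helper below is illustrative, assuming `prompt_len` marks where the scenario prompt ends in the tokenized sequence; `-100` is the index that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss by default

def mask_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the scenario-prompt tokens so
    loss is computed only on the assistant's reasoning tokens."""
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    return labels

# Toy sequence: 4 prompt tokens followed by 3 assistant tokens.
labels = mask_labels([101, 7, 8, 9, 42, 43, 44], prompt_len=4)
# The first four positions are masked; loss applies only to the last three.
```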
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("reaperdoesntknow/Shepherd-Alpha")
tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/Shepherd-Alpha")
messages = [
{
"role": "user",
"content": "Analyze this tactical scenario.\n\nScenario: A mechanized platoon advancing through urban terrain detects a coordinated drone swarm from the northeast. Limited anti-air capability. Civilian structures restrict fields of fire."
}
]
# Qwen3 tokenizers also accept enable_thinking=False here for cleaner
# structured output (see Limitations below).
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)
output = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
top_p=0.9,
do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt.
generated = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```
## The Shepherd Program
Shepherd-Alpha is the first public model in the Shepherd family, an ongoing research program developing AI systems for autonomous defense applications. The program spans:
- **Shepherd Doctrine**: a comprehensive counter-swarm and area defense blueprint covering 28+ subsystems across five concentric engagement layers
- **Shepherd AI**: tactical reasoning models trained on dual-perspective analysis (this model)
- **BiCell Dispersal**: a training methodology based on the B_i Cell Dispersal framework for stochastic layer partitioning during fine-tuning
## Limitations
- **Alpha release**: this is a research checkpoint, not a production system
- **Small training set**: 150 scenarios provide format and domain grounding but limited tactical depth. Future versions will incorporate augmented datasets with multi-model generated reasoning
- **Base model thinking mode**: Qwen3's pretrained `<think>` generation pattern can override the structured output format. Pass `enable_thinking=False` to `apply_chat_template` for cleaner output
- **Not a weapon system**: this model performs analysis and reasoning. It does not control, target, or actuate anything
## Citation
```bibtex
@misc{shepherd-alpha-2026,
title={Shepherd-Alpha: Tactical Reasoning via BiCell Depth Dispersal},
author={Convergent Intelligence LLC},
year={2026},
url={https://huggingface.co/reaperdoesntknow/Shepherd-Alpha}
}
```
## Related Work
- [Structure Over Scale](https://doi.org/10.57967/hf/5165): foundation paper on structure-first training methodologies
- [DualMind Methodology](https://doi.org/10.57967/hf/5184): dual-cognitive-mode SFT using EXPLORE/EXAMINE tokens
- [Discrepancy Calculus](https://doi.org/10.57967/hf/5194): mathematical framework grounding BiCell dispersal theory
- [B_i Cell Dispersal Framework](https://convergentintel.com): stochastic layer freezing grounded in DISC measure theory
---
*Convergent Intelligence LLC: Research Division*
*"Structure beats scale. Collaboration beats hierarchy. Observation beats theory."*