---
license: apache-2.0
language:
- en
tags:
- tactical-reasoning
- military
- defense-ai
- bicell-dispersal
- sft
- dual-perspective
- shepherd
- convergentintel
- qwen
- ai
base_model: Qwen/Qwen3-1.7B
datasets:
- ZennyKenny/tactical-military-reasoning-v.1.0
library_name: transformers
pipeline_tag: text-generation
---

# Shepherd-Alpha

**The first defense AI reasoning model on Hugging Face.**

Shepherd-Alpha is a tactical reasoning model fine-tuned on dual-perspective military scenario analysis using BiCell Depth Dispersal, a novel training methodology that partitions transformer layers by abstraction depth and trains them asymmetrically to separate representation encoding from task-specific reasoning.

Developed by [Convergent Intelligence LLC: Research Division](https://convergentintel.com)

## What This Model Does

Given a tactical scenario, Shepherd-Alpha produces a structured dual-perspective analysis:

- **Attack reasoning**: how an adversary would exploit the situation
- **Defense reasoning**: how to counter, mitigate, and survive

The model is trained to think like both attacker and defender simultaneously: a model that understands how to attack becomes a defender that anticipates.

## Training Methodology: BiCell Depth Dispersal

Standard fine-tuning updates all layers jointly, allowing co-adaptation that can mask shallow learning. BiCell Depth Dispersal forces genuine specialization:

| Phase | Frozen | Training | Purpose |
|-------|--------|----------|---------|
| 1 | Upper layers (14-27) | Lower layers (0-13) | Foundations encode before specialization exists |
| 2 | Lower layers (0-13) | Upper layers (14-27) | Reasoning learns over frozen representations |
| 3 | None | All layers | Joint integration of asymmetric gradient history |

All three backward passes accumulate gradients before a single optimizer step. The asymmetric gradient history forces each depth zone to develop independently before integration.
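
The three-phase accumulation schedule can be sketched in plain Python. This is an illustration of the schedule only: `backward_pass` is a stand-in for a real PyTorch backward over the trainable zone, and the gradient values are placeholders.

```python
# Minimal sketch of BiCell Depth Dispersal's three-phase gradient
# accumulation. Zone split mirrors the card: Lo = layers 0-13, Hi = 14-27.
NUM_LAYERS = 28
SPLIT = 14

def backward_pass(trainable):
    """Placeholder backward pass: a unit gradient per trainable layer."""
    return {i: 1.0 for i in trainable}

def bicell_step():
    grads = {i: 0.0 for i in range(NUM_LAYERS)}
    phases = [
        range(0, SPLIT),           # Phase 1: train lower, upper frozen
        range(SPLIT, NUM_LAYERS),  # Phase 2: train upper, lower frozen
        range(0, NUM_LAYERS),      # Phase 3: joint integration
    ]
    for trainable in phases:
        for i, g in backward_pass(trainable).items():
            grads[i] += g  # accumulate across all three backward passes
    return grads  # a single optimizer.step() would consume these once

grads = bicell_step()
# Each layer accumulates two passes: its own zone phase plus the joint phase.
assert all(g == 2.0 for g in grads.values())
```

The point the sketch makes concrete is that no layer's parameters move until all three backward passes have contributed, so each zone's gradient history forms independently before the single update.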

**Key finding during training:** Lower layers consistently produced ~1.7x the gradient magnitude of upper layers during domain adaptation. The pretrained upper layers already possess sufficient reasoning capacity; the primary adaptation is teaching lower layers to encode tactical domain structure. This suggests that for domain-specific SFT, the representation layers, not the reasoning layers, are the bottleneck.
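
One way to read that ratio as a measurement: compare the mean per-layer gradient magnitude of each zone. The numbers below are made up for illustration, not taken from the training logs.

```python
# Toy illustration of the per-zone comparison behind the ~1.7x observation.
def zone_ratio(grad_norms, split=14):
    """Mean gradient magnitude of Zone Lo divided by that of Zone Hi."""
    lo, hi = grad_norms[:split], grad_norms[split:]
    return (sum(lo) / len(lo)) / (sum(hi) / len(hi))

# Hypothetical per-layer gradient norms for a 28-layer model.
norms = [1.7] * 14 + [1.0] * 14
assert abs(zone_ratio(norms) - 1.7) < 1e-9
```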

### Training Details

- **Base model:** Qwen/Qwen3-1.7B (28 layers, all full attention)
- **Dataset:** [ZennyKenny/tactical-military-reasoning-v.1.0](https://huggingface.co/datasets/ZennyKenny/tactical-military-reasoning-v.1.0): 150 dual-perspective tactical scenarios with attack and defense chain-of-thought reasoning (MIT licensed)
- **Architecture:** 28 transformer layers split at depth 14 into Zone Lo (layers 0-13) and Zone Hi (layers 14-27)
- **Hardware:** NVIDIA A100
- **Epochs:** 3
- **Batch size:** 2
- **Learning rate:** 2e-5 (AdamW, weight decay 0.01)
- **Precision:** bfloat16
- **Label masking:** loss computed only on assistant (reasoning) tokens, not scenario prompts

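The label-masking detail above can be sketched as follows. This is a minimal illustration with made-up token IDs; `-100` is the ignore index consumed by PyTorch-style cross-entropy loss.

```python
IGNORE_INDEX = -100  # positions with this label contribute no loss

def mask_prompt_labels(input_ids, prompt_len):
    """Build labels from input_ids, masking the scenario-prompt tokens
    so loss is computed only on the assistant reasoning tokens."""
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])

# Made-up IDs: 3 scenario-prompt tokens followed by 2 assistant tokens.
labels = mask_prompt_labels([101, 102, 103, 9001, 9002], prompt_len=3)
assert labels == [-100, -100, -100, 9001, 9002]
```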
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("reaperdoesntknow/Shepherd-Alpha")
tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/Shepherd-Alpha")

messages = [
    {
        "role": "user",
        "content": "Analyze this tactical scenario.\n\nScenario: A mechanized platoon advancing through urban terrain detects a coordinated drone swarm from the northeast. Limited anti-air capability. Civilian structures restrict fields of fire.",
    }
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)

output = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
generated = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

## The Shepherd Program

Shepherd-Alpha is the first public model in the Shepherd family, an ongoing research program developing AI systems for autonomous defense applications. The program spans:

- **Shepherd Doctrine**: a comprehensive counter-swarm and area-defense blueprint covering 28+ subsystems across five concentric engagement layers
- **Shepherd AI**: tactical reasoning models trained on dual-perspective analysis (this model)
- **BiCell Dispersal**: a training methodology based on the B_i Cell Dispersal framework for stochastic layer partitioning during fine-tuning

## Limitations

- **Alpha release**: a research checkpoint, not a production system
- **Small training set**: 150 scenarios provide format and domain grounding but limited tactical depth; future versions will incorporate augmented datasets with multi-model generated reasoning
- **Base model thinking mode**: Qwen3's pretrained `<think>` generation pattern can override the structured output format; pass `enable_thinking=False` to `tokenizer.apply_chat_template` for cleaner output
- **Not a weapon system**: the model performs analysis and reasoning only; it does not control, target, or actuate anything

## Citation

```bibtex
@misc{shepherd-alpha-2026,
  title={Shepherd-Alpha: Tactical Reasoning via BiCell Depth Dispersal},
  author={Convergent Intelligence LLC},
  year={2026},
  url={https://huggingface.co/reaperdoesntknow/Shepherd-Alpha}
}
```

## Related Work

- [Structure Over Scale](https://doi.org/10.57967/hf/5165): foundation paper on structure-first training methodologies
- [DualMind Methodology](https://doi.org/10.57967/hf/5184): dual-cognitive-mode SFT using EXPLORE/EXAMINE tokens
- [Discrepancy Calculus](https://doi.org/10.57967/hf/5194): mathematical framework grounding BiCell dispersal theory
- [B_i Cell Dispersal Framework](https://convergentintel.com): stochastic layer freezing grounded in DISC measure theory

---

*Convergent Intelligence LLC: Research Division*
*"Structure beats scale. Collaboration beats hierarchy. Observation beats theory."*
| <!-- cix-keeper-ts:2026-04-11T16:09:32Z --> |
|
|