SwarmAtlas-27B
Capital markets intelligence model. Trained on 45,039 curated CRE pairs. Underwrites real deals at institutional grade: 12/12 math accuracy on live validation.
Model Description
SwarmAtlas-27B is a domain-specific language model fine-tuned for commercial real estate capital markets intelligence. Built by Swarm & Bee, it transforms raw deal parameters into institutional-grade underwriting, IC memos, waterfall analyses, and capital stack recommendations.
The thesis: We don't sell models. We sell verified training data. SwarmAtlas exists to prove the data is bankable, and it did. On a live CRE deal stress test, it scored 12 out of 12 on mathematical accuracy and correctly identified the structural deal-killer that would have cost the LP their preferred return.
Key Facts
| Attribute | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-27B Dense |
| Architecture | Gated Delta Networks (75% GDN + 25% Standard Attention) |
| Parameters | 27B (all active, dense) |
| Hidden Dim | 5,120 |
| Layers | 64 |
| Vocab | 248,320 |
| Context | 16,384 tokens (training) / 262K native / 1M via YaRN |
| Training Method | bf16 LoRA r=64 alpha=32 |
| Training Steps | 844 |
| Training Loss | 0.4186 |
| Eval Loss | 0.2238 |
| Training Time | 29.32 hours |
| Training GPU | NVIDIA RTX PRO 6000 Blackwell (96GB) |
| Serving | vLLM 0.17.0, 88 tok/s @ 4 concurrent |
Training Data
45,039 capital markets training pairs assembled from 5 pools. The per-pool counts below sum to 45,000, so they appear to be rounded:
| Pool | Share | Pairs | Content |
|---|---|---|---|
| Diversified | 60% | 27,000 | CMBS, rate advisory, equity structuring, valuation |
| RPA (Risk) | 25% | 11,200 | Risk-weighted scenarios, stress tests, tail events |
| Macro Graph | 8% | 3,600 | Macroeconomic causality chains, deal relationship graphs |
| Golden | 4% | 1,800 | Hand-verified exemplars from production signals |
| Mutations | 3% | 1,400 | Deliberately perturbed scenarios for robustness |
Cook Streams
| Stream | Description |
|---|---|
| Debt Maturity | CMBS loan maturity, refinancing, balloon risk |
| CMBS Distress | Special servicing, workouts, REO dispositions |
| Rate Advisory | Interest rate hedging, swap analysis, forward curves |
| Equity Advisory | JV structuring, promote waterfalls, GP/LP economics |
| Valuation | DCF, direct cap, sales comparison, cost approach |
| Deal Origination | Pipeline management, broker opinion of value |
| Macro Causality | Fed policy impact, yield curve analysis, CRE cycles |
| Deal Graph | Entity relationships, capital stack mapping, ownership chains |
Reasoning Tiers
| Tier | Capability |
|---|---|
| Bronze | NOI calculation, cap rate derivation |
| Silver | Rent roll analysis, occupancy modeling, loss-to-lease |
| Gold | Waterfall distribution, refi analysis, capital stack structuring |
| Platinum | Stress testing, IC recommendation, kill/defend decision |
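As an illustration of what the Gold tier covers, the sketch below pushes a single exit cash flow through a simple waterfall: return of LP capital, then a non-compounding preferred return, then a split of whatever remains. The function name, the 70/30 split, and every dollar figure are hypothetical. Real promote structures (compounding pref, GP catch-up, IRR hurdles) are more involved, and this is not SwarmAtlas output.

```python
def simple_waterfall(cash, lp_capital, pref_rate, years, lp_split):
    """Distribute one exit cash flow: LP capital, then LP pref, then a split.

    Assumes a simple (non-compounding) preferred return and a single
    distribution event. All inputs are hypothetical.
    """
    remaining = cash

    # Tier 1: return LP capital
    capital_returned = min(remaining, lp_capital)
    remaining -= capital_returned

    # Tier 2: pay the accrued preferred return
    pref_due = lp_capital * pref_rate * years
    pref_paid = min(remaining, pref_due)
    remaining -= pref_paid

    # Tier 3: split the residual between LP and GP
    lp_residual = remaining * lp_split
    gp_promote = remaining - lp_residual

    return {
        "lp_total": capital_returned + pref_paid + lp_residual,
        "gp_total": gp_promote,
        "pref_shortfall": pref_due - pref_paid,
    }


full = simple_waterfall(15_000_000, 10_000_000, 0.08, 3, 0.70)
short = simple_waterfall(11_000_000, 10_000_000, 0.08, 3, 0.70)
print(full)
print(short)
```

In the second call the exit proceeds cover the LP's capital but not the full pref, so `pref_shortfall` is positive and the GP receives nothing. That is the kind of outcome a kill/defend decision turns on.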
Validation: The Memphis IC Test
SwarmAtlas was validated on a real CRE deal stress test:
Deal: 312-unit Class B Multifamily, Memphis, TN
Basis: $14.2M
Financing: 80% LTC bridge loan @ 8.35%
Exit Cap: 5.75%
Results:
| Metric | Result |
|---|---|
| Math Accuracy | 12/12 (zero errors) |
| Verdict | NUKED (correct kill decision) |
| Structural Flaw | Leverage compression (80% LTC in -> 65% LTV out) + 8.35% bridge carry = LP doesn't clear 8% pref at 5.75% exit cap |
| Institutional Detail | Model added 5% soft cost buffer (standard practice, not in prompt) |
| Output | 10,220 tokens (complete waterfall analysis) |
When your model can underwrite a real deal, get every number right, and correctly identify the structural deal-killer, that's not fine-tuning. That's institutional intelligence.
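The leverage-compression flaw can be reproduced from the stated deal terms alone. The following is a minimal sketch, assuming the $14.2M basis is the cost the 80% LTC loan is sized on, the bridge loan is interest-only, and the takeout lender sizes at 65% LTV on a value derived from the 5.75% exit cap. It ignores the 5% soft cost buffer and is not the model's own 12-step calculation.

```python
BASIS = 14_200_000      # from the deal summary
LTC = 0.80              # bridge loan-to-cost
BRIDGE_RATE = 0.0835    # bridge coupon, assumed interest-only
EXIT_LTV = 0.65         # takeout loan-to-value
EXIT_CAP = 0.0575       # exit cap rate

bridge_loan = BASIS * LTC
equity = BASIS - bridge_loan
annual_bridge_interest = bridge_loan * BRIDGE_RATE

# Property value at which a 65% LTV takeout loan just repays the bridge
breakeven_value = bridge_loan / EXIT_LTV
# NOI needed to support that value at the exit cap
breakeven_noi = breakeven_value * EXIT_CAP


def refi_proceeds(noi):
    """Takeout loan proceeds for a given stabilized NOI."""
    return noi / EXIT_CAP * EXIT_LTV


print(f"Bridge loan:            ${bridge_loan:,.0f}")
print(f"Equity:                 ${equity:,.0f}")
print(f"Annual bridge interest: ${annual_bridge_interest:,.0f}")
print(f"Break-even value:       ${breakeven_value:,.0f}")
print(f"Break-even NOI:         ${breakeven_noi:,.0f}")
```

Under these assumptions, any stabilized NOI below roughly $1.0M leaves a refinancing gap that must be filled with fresh equity. That gap arises before the bridge interest of about $949K per year and the LP's 8% pref are accounted for.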
Training Configuration
# SwarmAtlas-27B Gold Standard Config
base_model: Qwen/Qwen3.5-27B
method: bf16 LoRA (no QLoRA; higher quantization error on Qwen3.5)
lora_r: 64
lora_alpha: 32
learning_rate: 1e-5
scheduler: cosine
warmup: 5%
weight_decay: 0.01
effective_batch_size: 32 (batch=2, grad_accum=16)
max_seq_len: 4096
epoch_fraction: 0.6
early_stopping: patience=3 on eval_loss
packing: true
framework: Unsloth + TRL SFTTrainer
tokenizer: AutoTokenizer (bypass for Qwen3.5 VL dispatch bug)
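The reported 844 training steps are consistent with the config values, and the same numbers pin down the shape of the learning-rate schedule. This is a rough consistency check, not the training code. It assumes steps are counted over all 45,039 pairs before packing and before any train/eval split, and that warmup is linear followed by cosine decay to zero. The helper name `lr_at` is illustrative.

```python
import math

NUM_PAIRS = 45_039          # from the model card
EPOCH_FRACTION = 0.6        # epoch_fraction
EFFECTIVE_BATCH = 2 * 16    # batch=2, grad_accum=16
PEAK_LR = 1e-5              # learning_rate
WARMUP_FRACTION = 0.05      # warmup: 5%

# Assumption: one optimizer step consumes 32 unpacked pairs
total_steps = math.floor(NUM_PAIRS * EPOCH_FRACTION / EFFECTIVE_BATCH)
warmup_steps = math.ceil(total_steps * WARMUP_FRACTION)


def lr_at(step):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"total steps:  {total_steps}")
print(f"warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```

Because `packing: true` merges several pairs into one sequence, the real number of sequences per step may differ. The match with 844 is only a plausibility check.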
Loss Curve
Step  10: ████████████████████████████████████████████████████ 1.051
Step  50: █████████████████████████████████████ 0.742
Step 100: ██████████████████████████████ 0.598
Step 200: ██████████████████████████ 0.522 (eval: 0.533)
Step 400: ████████████████████████ 0.470 (eval: 0.269)
Step 600: ██████████████ 0.290 (eval: 0.227)
Step 800: █████████████ 0.270 (eval: 0.224)
Step 844: █████████████ 0.266
Final eval loss: 0.2238. Eval loss decreased at every logged checkpoint (0.533, 0.269, 0.227, 0.224), which shows no sign of overfitting.
Quality Pipeline
Every training pair passes through Swarm & Bee's 6-gate deterministic pipeline:
- Schema Gate: valid JSONL, required fields present
- Length Gate: answer meets minimum depth threshold (500 chars text, 20 chars JSON)
- Duplication Gate: MD5 fingerprint-based dedup across all shards
- Specialty Gate: verified against capital markets taxonomy
- Coherence Gate: question-answer alignment scoring
- Toxicity Gate: safety and compliance filter
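The first three gates are deterministic enough to sketch directly. The example below is a hypothetical illustration, not Swarm & Bee's pipeline code. The field names `question` and `answer`, the rule that an answer counts as JSON when it parses to an object or array, and the whitespace and case normalization before hashing are all assumptions. The length thresholds and the use of MD5 come from the list above.

```python
import hashlib
import json

REQUIRED_FIELDS = ("question", "answer")  # hypothetical field names
MIN_TEXT_CHARS = 500
MIN_JSON_CHARS = 20


def schema_gate(line):
    """Return the parsed record if the line is a JSON object with non-empty
    string values for every required field, otherwise None."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            return None
    return record


def is_json_answer(answer):
    """Assumption: an answer is 'JSON' if it parses to an object or array."""
    try:
        return isinstance(json.loads(answer), (dict, list))
    except json.JSONDecodeError:
        return False


def length_gate(record):
    answer = record["answer"]
    minimum = MIN_JSON_CHARS if is_json_answer(answer) else MIN_TEXT_CHARS
    return len(answer) >= minimum


def fingerprint(record):
    """MD5 over the whitespace-normalized, lowercased question and answer."""
    text = record["question"] + "\n" + record["answer"]
    normalized = " ".join(text.split()).lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def run_gates(lines):
    """Apply the schema, length, and duplication gates in order."""
    seen = set()
    kept = []
    for line in lines:
        record = schema_gate(line)
        if record is None or not length_gate(record):
            continue
        digest = fingerprint(record)
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(record)
    return kept
```

To deduplicate across shards, the `seen` set would need to be shared between shards, for example by persisting the fingerprints. The remaining three gates depend on a taxonomy and scoring models that are not described here.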
Pairs that pass all 6 gates enter CoVe promotion:
- Llama-70B rewrites for clarity and completeness
- Qwen-235B scores on accuracy, completeness, structure, relevance, sft_quality
- Minimum 20/25 total score, all criteria >= 3, accuracy >= 4
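The promotion threshold reduces to three checks on the scorer's output. This is a small sketch, assuming each of the five criteria is scored as an integer from 1 to 5 (implied by the 25-point total) and that scores arrive as a dictionary keyed by criterion name. The function name is illustrative.

```python
CRITERIA = ("accuracy", "completeness", "structure", "relevance", "sft_quality")


def passes_promotion(scores):
    """Apply the stated thresholds: total >= 20, every criterion >= 3,
    accuracy >= 4. Raises KeyError if a criterion is missing."""
    values = [scores[name] for name in CRITERIA]
    return sum(values) >= 20 and min(values) >= 3 and scores["accuracy"] >= 4


examples = {
    "solid": dict(accuracy=5, completeness=4, structure=4, relevance=4, sft_quality=4),
    "weak accuracy": dict(accuracy=3, completeness=5, structure=5, relevance=5, sft_quality=5),
    "one bad criterion": dict(accuracy=5, completeness=5, structure=5, relevance=5, sft_quality=2),
    "low total": dict(accuracy=4, completeness=4, structure=4, relevance=4, sft_quality=3),
}
for label, scores in examples.items():
    print(f"{label}: {passes_promotion(scores)}")
```

Only the first example passes. The second fails the accuracy floor despite a total of 23, the third fails the per-criterion floor, and the fourth totals 19.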
Provenance: Every batch is Merkle-hashed and published to Hedera Consensus Service (HCS) for an immutable audit trail. EU AI Act Article 53(1)(d) compliant.
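To make the Merkle-hashing step concrete, the sketch below computes a Merkle root over the lines of a batch. The hash function (SHA-256), the leaf definition (the hash of each serialized pair), and the rule of duplicating the last node on odd-sized levels are assumptions, since the actual scheme is not specified here. Publishing the root to HCS is not shown.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the hex Merkle root of a non-empty list of byte strings.

    Assumed scheme: leaves are hashed once, parents hash the concatenation
    of their two children, and the last node is duplicated on odd levels.
    """
    if not leaves:
        raise ValueError("cannot hash an empty batch")
    level = [_sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Hypothetical batch of three serialized pairs
batch = [b'{"question": "a", "answer": "1"}',
         b'{"question": "b", "answer": "2"}',
         b'{"question": "c", "answer": "3"}']
print(merkle_root(batch))
```

Changing any byte of any pair changes the root, which is why a single published hash can attest to the contents of a whole batch.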
Usage
API Access
SwarmAtlas-27B is served via an OpenAI-compatible API:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.swarmandbee.ai/v1",
    api_key="YOUR_API_KEY"  # Get key at swarmandbee.ai/datasets
)

response = client.chat.completions.create(
    model="swarm/atlas-27b",
    messages=[
        {
            "role": "system",
            "content": "You are SwarmAtlas, a capital markets intelligence model."
        },
        {
            "role": "user",
            "content": (
                "Underwrite this deal: 120,000 SF industrial warehouse in Dallas, "
                "listed at $18.5M, 5.8% cap rate, 3PL tenant on 15-year NNN lease "
                "with 2.5% annual escalations."
            )
        }
    ]
)
print(response.choices[0].message.content)
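Before reading the model's answer, it helps to know what the prompt's own numbers imply. The following is a back-of-envelope check, assuming the listed cap rate applies to in-place year-1 NOI and that NOI under the NNN lease grows with the 2.5% annual escalations. Variable names are illustrative.

```python
PRICE = 18_500_000     # list price from the prompt
CAP_RATE = 0.058       # listed cap rate
AREA_SF = 120_000      # building size
ESCALATION = 0.025     # annual rent escalation
LEASE_YEARS = 15       # lease term

# Year-1 NOI implied by price and cap rate
implied_noi = PRICE * CAP_RATE
noi_per_sf = implied_noi / AREA_SF
price_per_sf = PRICE / AREA_SF

# Assumption: NOI tracks the contractual escalations over the lease term
noi_by_year = [implied_noi * (1 + ESCALATION) ** (year - 1)
               for year in range(1, LEASE_YEARS + 1)]

print(f"Implied year-1 NOI: ${implied_noi:,.0f} (${noi_per_sf:.2f}/SF)")
print(f"Price per SF:       ${price_per_sf:,.2f}")
print(f"Year-{LEASE_YEARS} NOI:        ${noi_by_year[-1]:,.0f}")
```

The implied year-1 NOI is about $1.07M, or roughly $8.94 per SF. A response whose NOI or per-SF figures drift far from these deserves a closer look.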
Training Data
The full training dataset is available via API:
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.swarmandbee.ai/api/data/pull?dataset=capital-markets-intelligence&limit=100"
See SwarmandBee/capital-markets-intelligence for dataset details.
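The query string in that request contains `&`, which a shell treats as a control operator unless the URL is quoted. Building the request in Python sidesteps the issue. This is a sketch using only the standard library. The endpoint and parameters are taken from the curl example, while the helper name `build_pull_request` is hypothetical and the response format is not documented here, so the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.swarmandbee.ai/api/data/pull"


def build_pull_request(api_key, dataset="capital-markets-intelligence", limit=100):
    """Build (but do not send) an authenticated dataset pull request."""
    query = urlencode({"dataset": dataset, "limit": limit})
    return Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_pull_request("YOUR_API_KEY")
print(req.full_url)
# To send it: urllib.request.urlopen(req), then parse the body according
# to whatever format the API actually returns.
```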
Free Sample
This repository includes 1,000 free CRE training pairs in samples/cre_sample_1000.jsonl. These are real production pairs from the Swarm & Bee data estate, not synthetic demos.
import json

# Load the sample pairs: one JSON object per line, skipping blank lines
with open("samples/cre_sample_1000.jsonl", encoding="utf-8") as f:
    pairs = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(pairs)} CRE pairs")
print(f"Task types: {set(p.get('task_type', 'unknown') for p in pairs)}")
The Swarm & Bee Data Estate
SwarmAtlas is trained on a subset of the Swarm & Bee intelligence estate:
| Dataset | Pairs | HuggingFace |
|---|---|---|
| CRE Intelligence | 893,348 | cre-intelligence-objects |
| Medical Intelligence | 432,196 | medical-intelligence |
| Capital Markets | 45,039 | capital-markets-intelligence |
| Aviation | 60,458 | aviation-intelligence |
| Signal Intelligence | 28,624 | signal-intelligence |
| Total | 1,459,665+ | |
About Swarm & Bee
Swarm & Bee is an AI data refinery. We curate domain-specific training data, verify it with models, and sell what's proven bankable.
- Founder: Donovan Mackey, 30-year CRE veteran, national platform, $8B in closed transactions
- Hardware: NVIDIA RTX PRO 6000 Blackwell GPUs (96GB each)
- Pipeline: Signal -> Curate -> Gate -> Promote -> Verify -> Seal
- Provenance: Every pair tracked on Hedera Consensus Service
- Publications: 8 DOIs on Zenodo
| Channel | Contact |
|---|---|
| Website | swarmandbee.ai |
| API | api.swarmandbee.ai |
| Email | build@swarmandbee.com |
| Phone | 561-532-7120 |
Citation
@misc{mackey2026swarmatlas,
title={SwarmAtlas-27B: Capital Markets Intelligence Model},
author={Mackey, Donovan},
year={2026},
publisher={Swarm \& Bee Intelligence},
url={https://swarmandbee.ai},
note={Trained on 45,039 curated capital markets pairs. Loss 0.4186. 12/12 math accuracy on live CRE deal validation.}
}
License
Apache 2.0: commercial use permitted. See LICENSE for details.
The included sample data (samples/cre_sample_1000.jsonl) is released under the same Apache 2.0 license. Full dataset access requires an API key from swarmandbee.ai.