Update README.md
README.md
CHANGED
|
@@ -1,10 +1,101 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji:
|
| 4 |
-
colorFrom:
|
| 5 |
-
colorTo:
|
| 6 |
sdk: static
|
| 7 |
-
pinned:
|
| 8 |
---
|
| 9 |
|
| 10 |
-
|
| 1 |
---
|
| 2 |
+
title: CoDynamics Lab Corporation
|
| 3 |
+
emoji: ⚡
|
| 4 |
+
colorFrom: blue
|
| 5 |
+
colorTo: indigo
|
| 6 |
sdk: static
|
| 7 |
+
pinned: true
|
| 8 |
+
thumbnail: >-
|
| 9 |
+
https://cdn-uploads.huggingface.co/production/uploads/694646634c20c7f3d0f2eaf3/7CLHvdkIhItLCqpe7_u3W.png
|
| 10 |
---
|
| 11 |
|
| 12 |
+
<div align="center">
|
| 13 |
+
<img src="https://www.codynamicslab.com/logo.png" alt="CoDynamics Lab" width="180" />
|
| 14 |
+
<h1>CoDynamics Lab Corporation</h1>
|
| 15 |
+
<p><strong>Eliminating the Long-Context Tax in enterprise AI.</strong></p>
|
| 16 |
+
|
| 17 |
+
<a href="https://www.codynamicslab.com">🌐 Website</a> |
|
| 18 |
+
<a href="mailto:mike@codynamicslab.com">✉️ Contact</a> |
|
| 19 |
+
<a href="https://huggingface.co/CoDynamicsLab/LATCH-Qwen2.5-14B">🔒 Request Model Access</a>
|
| 20 |
+
</div>
|
| 21 |
+
|
| 22 |
+
---
|
| 23 |
+
|
| 24 |
+
## What We Build
|
| 25 |
+
|
| 26 |
+
Standard LLMs impose a compounding penalty as context grows: linear prefill cost, high latency, and expensive re-ingestion on every query. We built **LATCH** (Latent Activation Token Cache Handoff) to eliminate this penalty entirely.
|
| 27 |
+
|
| 28 |
+
LATCH is a proprietary, model-agnostic inference layer that compiles documents once into a persistent latent representation and hands it directly into the decode path — achieving **constant-time performance regardless of document length**.
|
| 29 |
+
|
| 30 |
+
The result: responses that begin in under 120 milliseconds, infrastructure costs that collapse, and cross-document reasoning that scales.
|
| 31 |
+
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## LATCH Performance — Verified Results
|
| 35 |
+
|
| 36 |
+
| Model Family | Status | Avg. TTFT Speedup | E2E Speedup | Multi-Doc Pass Rate |
|
| 37 |
+
|---|---|---|---|---|
|
| 38 |
+
| **Qwen 2.5 14B** | ✅ Production Ready | **42.9×** | **12.9×** | **91.7% (11/12)** |
|
| 39 |
+
| **Mistral Nemo 12B** | ✅ Verified | **104.0×** | TBD | **83.3% (10/12)** |
|
| 40 |
+
| **Llama 3.1 8B** | ✅ Verified | **116.3×** | TBD | **83.3% (10/12)** |
|
| 41 |
+
| **DeepSeek R1 Distill** | 🔄 In Training | *Pending* | *Pending* | *Pending* |
|
| 42 |
+
|
| 43 |
+
> **Headline:** Time-To-First-Token on Qwen 2.5 14B reduced from **23.1s → 0.11s** (210× improvement on cold load).
|
| 44 |
+
> **Persistent cache reload:** **0.0016s** — a 246× speedup over standard re-ingestion.
|
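The two headline ratios above can be checked with simple division. The sketch below assumes each "×" figure is a plain ratio of baseline time to optimized time; the function name `speedup` is illustrative and not part of any LATCH interface. The implied re-ingestion baseline is derived from the quoted numbers, not stated in the README.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Return how many times faster the optimized path is than the baseline."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


# Time-To-First-Token figures quoted for Qwen 2.5 14B.
ttft_speedup = speedup(23.1, 0.11)
print(round(ttft_speedup))  # 210

# Assumption: the 246x cache-reload figure is also a plain ratio, so the
# implied baseline is the reload time multiplied by the speedup.
implied_reingest_baseline = 0.0016 * 246
print(round(implied_reingest_baseline, 4))  # 0.3936
```

Note that the 210× headline is a single cold-load measurement, while the 42.9× in the table is reported as an average, so the two numbers are not expected to match.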
| 45 |
+
|
| 46 |
+
---
|
| 47 |
+
|
| 48 |
+
## The Economics
|
| 49 |
+
|
| 50 |
+
Every query against an uncompiled document re-pays the full prefill cost. LATCH breaks this model.
|
| 51 |
+
|
| 52 |
+
- **Break-even at 0.0051 queries** — the cost of compilation is recovered in the first fraction of a single query
|
| 53 |
+
- **~30 GB VRAM** to run Qwen 2.5 14B (vs ~61 GB baseline) — more models per node
|
| 54 |
+
- Strongest amortization case: **28.5× end-to-end cost reduction** at scale
|
| 55 |
+
|
| 56 |
+
For teams running high-volume document analysis — M&A due diligence, legal review, compliance monitoring, financial research — this is a structural cost advantage, not a marginal one.
|
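One way to read the break-even figure above is as the one-time compilation cost divided by the saving on each later query. The sketch below assumes that definition; the function names and the example costs are hypothetical, in arbitrary units, and are not measurements from this README.

```python
def break_even_queries(compile_cost: float,
                       baseline_query_cost: float,
                       cached_query_cost: float) -> float:
    """Queries needed before compiling once beats re-ingesting every time."""
    saving_per_query = baseline_query_cost - cached_query_cost
    if saving_per_query <= 0:
        raise ValueError("cached queries must be cheaper than baseline queries")
    return compile_cost / saving_per_query


def amortized_cost(compile_cost: float,
                   cached_query_cost: float,
                   n_queries: int) -> float:
    """Average cost per query once the compile cost is spread over n queries."""
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")
    return compile_cost / n_queries + cached_query_cost


# Hypothetical costs in arbitrary units (assumptions, not source data).
print(break_even_queries(compile_cost=0.5, baseline_query_cost=2.0,
                         cached_query_cost=1.0))  # 0.5
print(amortized_cost(compile_cost=0.5, cached_query_cost=1.0,
                     n_queries=4))  # 1.125
```

Under this definition, any break-even value below 1 means the compilation cost is recovered before the first query completes, which is how the 0.0051 figure reads.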
| 57 |
+
|
| 58 |
+
---
|
| 59 |
+
|
| 60 |
+
## Built For
|
| 61 |
+
|
| 62 |
+
| Use Case | What LATCH Changes |
|
| 63 |
+
|---|---|
|
| 64 |
+
| **M&A / Private Equity Due Diligence** | Compile the data room once. Query hundreds of documents in seconds per session. |
|
| 65 |
+
| **Legal Document Review** | Cross-contract reasoning at constant latency across large clause sets. |
|
| 66 |
+
| **Compliance & Regulatory Analysis** | Persistent document memory means re-runs are nearly free. |
|
| 67 |
+
| **Financial Research** | Multi-document synthesis with sub-second response on dense filings. |
|
| 68 |
+
|
| 69 |
+
---
|
| 70 |
+
|
| 71 |
+
## Deployment Options
|
| 72 |
+
|
| 73 |
+
**🔒 Self-Hosted License — $79**
|
| 74 |
+
Locked model weights + inference runtime for your own A100/H100 infrastructure. Data never leaves your environment. License key delivery via [Gumroad](#) after purchase.
|
| 75 |
+
|
| 76 |
+
**☁️ Managed Instance — from $5.00/hr**
|
| 77 |
+
Spin up a LATCH-ready GPU instance directly. Includes a batch JSON query interface: upload your document set, submit a structured prompt list, and export results. Billed by the wall-clock second. Coming soon.
|
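Per-second billing at the quoted starting rate can be estimated as follows. This is a rough sketch: it assumes the $5.00/hr floor rate applies and that charges are rounded to the nearest cent, neither of which the README confirms. `session_cost` is an illustrative name, not part of any published billing API.

```python
from decimal import Decimal, ROUND_HALF_UP

# "from $5.00/hr" in the README; actual instance rates may be higher.
HOURLY_RATE_USD = Decimal("5.00")


def session_cost(seconds: int, hourly_rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Estimate the charge for a session billed by wall-clock second."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    cost = hourly_rate * Decimal(seconds) / Decimal(3600)
    # Rounding to cents is an assumption about how charges are presented.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(session_cost(5400))  # 7.50 (a 90-minute session)
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.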
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## Architecture: CDLaC + LATCH
|
| 82 |
+
|
| 83 |
+
The LATCH compilation method and neural representation format are proprietary.
|
| 84 |
+
|
| 85 |
+
---
|
| 86 |
+
|
| 87 |
+
## Licensing
|
| 88 |
+
|
| 89 |
+
CoDynamics Lab Corporation operates under a **Proprietary & Commercial Licensing** model.
|
| 90 |
+
|
| 91 |
+
- **Gated Access:** Model weights and inference adapters are provided via approved repository requests only
|
| 92 |
+
- **Commercial Use:** Production or commercial deployment requires a separate license agreement
|
| 93 |
+
- **Research Inquiries:** Academic or research access requests are considered case by case
|
| 94 |
+
|
| 95 |
+
---
|
| 96 |
+
|
| 97 |
+
<div align="center">
|
| 98 |
+
<strong>Commercial Inquiries & Gated Access Requests</strong><br/>
|
| 99 |
+
<a href="mailto:mike@codynamicslab.com">mike@codynamicslab.com</a> |
|
| 100 |
+
<a href="https://www.codynamicslab.com">www.codynamicslab.com</a>
|
| 101 |
+
</div>
|