Tessera 4
The Frontier of Efficiency: ORPO-Distilled Reasoning
Tessera 4 is a specialized mini-model designed to prove that massive scale is not a requirement for world-class reasoning. By utilizing ORPO (Odds Ratio Preference Optimization) and a high-signal distillation process from DeepSeek-R1, Tessera 4 achieves frontier-level performance in logic and mathematics while remaining small enough to run on consumer hardware (8GB VRAM).
🚀 The Reasoning Breakthrough
Tessera 4 was trained with a specific focus: Logical Accuracy over General Trivia. While we purposely allowed MMLU scores to sit at 66%, the trade-off resulted in a reasoning engine that surpasses its own teacher (DeepSeek-R1) and rivals GPT-5-class thresholds on core logic benchmarks.
📊 Benchmark Comparison
| Benchmark | Tessera 4 | DeepSeek-R1 | Llama 3.1 400B |
|---|---|---|---|
| GSM8K | 95% | 80.1% (Base) | 90%+ |
| ARC-Challenge | 93% | 90-92% | 90%+ |
| MMLU | 66% | 75%+ | 85%+ |
Note: Benchmarks conducted on randomized high-signal subsets to verify zero-shot reasoning capabilities.
🛠️ Technical Specifications
- Training Duration: ~8 Hours
- Hardware: 1x RTX 3090
- Methodology: ORPO Distillation
- Optimization: Focused on Chain-of-Thought (CoT) path correction, eliminating the "verbose fluff" typical of larger reasoning models.
💻 Hardware Requirements & Format
- Format: Full 16 Bit, 3090
- VRAM: Recommended 8GB+
- Compatibility: Optimized for LM Studio, Ollama, and llama.cpp.
🧠 Reasoning Showcase
All results generated at Q4_K_M quantization (4-bit).
🔢 1. High-Precision Math (15 Factorial)
Test: Calculate 15! step-by-step. Result: 1,307,674,368,000 (100% Correct)
Tessera 4 demonstrates zero-shot numerical stability, maintaining digit precision across 14 layers of multiplication.
📐 2. Unit Conversion & Physics
Test: A train travels 60km in 45 minutes. Find the speed in km/h. Result: 80 km/h (Correct)
The model correctly identifies the need to convert minutes to hours (0.75h) before applying the distance/time formula.
👽 3. Deep Logical Branching (The 3 Aliens)
Test: A complex "Truth-Teller, Liar, Alternator" puzzle. Result: Successfully identified Z=Truth, X=Alternator, Y=Liar (Correct)
Tessera 4 successfully tracked nested state changes and caught a logical contradiction in a secondary hypothesis branch.
🚗 4. Physical Grounding (The Car Wash)
Test: 100ft walk vs drive for a car wash. Result: Drive (Correct)
The model demonstrated common-sense grounding by realizing the "car" must be physically present at the car wash, overriding the "short walking distance" heuristic.
💬 Prompt Format
To achieve the scores listed above, you must use the correct prompt template. Since this is distilled from R1, it utilizes the DeepSeek-V3/R1 style:
<|im_start|>system
You are a highly logical reasoning engine. Think step-by-step.<|im_end|>
<|im_start|>user
[Your Question Here]<|im_end|>
<|im_start|>assistant
<|thought|>
- Downloads last month
- 3