Official Release: AetherMind-KD-Student (184M)
#2, opened by samerzaher80
I'm proud to release AetherMind-KD-Student, a compact 184M NLI model distilled from a DeBERTa-v3 teacher.
It achieves strong results across core, adversarial, and zero-shot NLI benchmarks.
Highlights
- 308 samples/sec on RTX 3050
- MNLI ~90.5%, SNLI ~89%
- Adversarial NLI:
  - ANLI R1: 73.6%
  - ANLI R2: 57.7%
  - ANLI R3: 53.67%
- Zero-shot:
  - SciTail: 78.83%
  - RTE: ~86.3%
  - HANS: ~77.7%
  - XNLI (EN): 90.92%
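For anyone who wants to reproduce a samples/sec figure like the one above on their own hardware, here is a minimal timing sketch. It is not the script used for the 308 samples/sec number; `measure_throughput`, `predict_fn`, the batch size, and the warm-up count are all hypothetical names and defaults chosen for illustration.

```python
import time
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (premise, hypothesis)


def measure_throughput(
    predict_fn: Callable[[Sequence[Pair]], object],
    samples: Sequence[Pair],
    batch_size: int = 32,      # assumed value, not taken from the post
    warmup_batches: int = 1,   # assumed value, not taken from the post
) -> float:
    """Return the number of samples processed per second by predict_fn."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not samples:
        raise ValueError("samples must not be empty")

    batches: List[Sequence[Pair]] = [
        samples[i:i + batch_size] for i in range(0, len(samples), batch_size)
    ]

    # Untimed warm-up calls so one-off setup cost does not skew the result.
    for batch in batches[:warmup_batches]:
        predict_fn(batch)

    processed = 0
    start = time.perf_counter()
    for batch in batches:
        predict_fn(batch)
        processed += len(batch)
    elapsed = time.perf_counter() - start

    return processed / elapsed if elapsed > 0 else float("inf")
```

When timing a real model on a GPU, `predict_fn` should wait for the device to finish (for example by moving the predictions back to the CPU) before returning; otherwise the measured rate can be too optimistic.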
Training Data
SNLI, MNLI, ANLI (R1-R3)
Zero-Shot Generalization
RTE, HANS, SciTail, XNLI (English)
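RTE, HANS, and SciTail are two-class datasets, while a model trained on SNLI/MNLI/ANLI predicts three classes, so some label mapping is needed to score them. The post does not say how this was done; the sketch below assumes the common approach of collapsing neutral and contradiction into a single "not_entailment" class. The label order in `THREE_WAY` and the helper names are hypothetical, so check the model's own `id2label` mapping before using it.

```python
from typing import List, Sequence

# Assumed 3-way label order; the released model's config may differ.
THREE_WAY = ("entailment", "neutral", "contradiction")


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def to_binary(label: str) -> str:
    """Collapse a 3-way NLI label into entailment / not_entailment."""
    return "entailment" if label == "entailment" else "not_entailment"


def binary_accuracy(
    logits: Sequence[Sequence[float]],
    gold: Sequence[str],
    id2label: Sequence[str] = THREE_WAY,
) -> float:
    """Accuracy of 3-way logits against 2-class gold labels."""
    if len(logits) != len(gold):
        raise ValueError("logits and gold must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    predictions: List[str] = [to_binary(id2label[argmax(row)]) for row in logits]
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Toy logits, not real model output.
example_logits = [[2.0, 0.1, -1.0], [0.0, 1.5, 0.2], [-1.0, 0.0, 3.0], [0.3, 0.2, 0.1]]
example_gold = ["entailment", "not_entailment", "not_entailment", "not_entailment"]
print(binary_accuracy(example_logits, example_gold))  # 0.75
```

XNLI (English) keeps all three classes, so it can be scored with a plain argmax and no collapsing step.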
Credits
Developed by Sameer S. Najm
AetherMind project, Sam IT Solutions, Iraq
Open for collaboration and feedback!