SOTA-Blitz-997

Near-SOTA Precision | 7-Minute T4 Training | Safetensors Native


Model Overview

SOTA-Blitz-997 is a high-velocity Vision Transformer (ViT) architecture optimized for the MNIST handwritten digit classification task. While most "State-of-the-Art" models rely on massive ensembles and hours of GPU compute, SOTA-Blitz-997 was engineered to achieve elite accuracy within a single 7-minute training window on a standard NVIDIA T4 by leveraging the global attention mechanisms of the Transformer block.

Performance & Proof

The model achieves a verified 99.72% Test Accuracy, leaving only 28 errors out of 10,000 images. This performance exceeds the human baseline (~97.5%) and demonstrates that ViT architectures can effectively "solve" classic computer vision benchmarks with extreme efficiency.

Training Logs (Verified Convergence)

| Epoch | Loss   | Train Acc | Test Acc | Best Acc |
|-------|--------|-----------|----------|----------|
| 05/30 | 0.6235 | 95.068%   | 98.440%  | 98.590%  |
| 10/30 | 0.5923 | 96.287%   | 98.840%  | 99.030%  |
| 15/30 | 0.5683 | 97.107%   | 99.220%  | 99.230%  |
| 20/30 | 0.5485 | 97.927%   | 99.460%  | 99.550%  |
| 25/30 | 0.5345 | 98.460%   | 99.660%  | 99.660%  |
| 30/30 | 0.5296 | 98.700%   | 99.720%  | 99.720%  |

Final Performance: 28 Errors / 10,000 Digits (TTA Enabled).
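The card does not specify which test-time augmentation (TTA) scheme was used, but a common choice for MNIST is to average softmax probabilities over small pixel shifts of each image. The sketch below illustrates that idea with a hypothetical stand-in classifier; substitute the trained ViT in practice.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in classifier (a single linear layer); in practice,
# replace this with the trained SOTA-Blitz-997 ViT.
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
model.eval()

def predict_tta(model, images):
    """Average softmax probabilities over small pixel shifts (one common TTA scheme)."""
    shifts = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    probs = torch.zeros(images.size(0), 10)
    with torch.no_grad():
        for dy, dx in shifts:
            # Shift the batch by (dy, dx) pixels along the spatial dims.
            shifted = torch.roll(images, shifts=(dy, dx), dims=(2, 3))
            probs += torch.softmax(model(shifted), dim=1)
    return probs / len(shifts)

batch = torch.randn(4, 1, 32, 32)  # dummy batch at the model's 32x32 input size
p = predict_tta(model, batch)
```

Averaging probabilities (rather than taking a majority vote over hard labels) tends to be more stable when the per-view predictions are close calls.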

Technical Specifications

  • Architecture: Optimized Vision Transformer (ViT) with patch embedding and multi-head self-attention.
  • Training Hardware: NVIDIA T4 GPU (Kaggle).
  • Training Time: ~7 Minutes.
  • Format: .safetensors (Zero-copy loading, no-pickle security).
  • License: Apache 2.0.
  • Architecture Note: Based on a timm ViT-Small backbone with a custom 1-channel patch embedding layer and 32x32 input resolution.

Usage

```python
import torch
from safetensors.torch import load_file

# Load the SOTA weights
model_weights = load_file("SOTA-Blitz-997.safetensors")

# Apply to your ViT architecture (the model must match the checkpoint's layout)
# model.load_state_dict(model_weights)
```

Made By

Andy-ML-And-AI
