SimpleConvNetLite: A Lightweight CIFAR-10 Image Classification Model

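A lightweight convolutional neural network built for fast training and deployment. It is trained on a subset of the CIFAR-10 dataset and can finish training on a CPU in under 10 minutes.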

Model Description

SimpleConvNetLite is a simplified CNN. Its architecture is simple and its parameter count is small, so it can run in resource-constrained environments.

Model Architecture

SimpleConvNetLite(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=4096, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=10, bias=True)
)

1 convolutional layer (16 filters, 3x3 kernel)
1 max-pooling layer
2 fully connected layers (64 hidden units)
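The `in_features=4096` on `fc1` follows from the layer shapes. The sketch below traces a 32x32 input through the convolution and pooling stages using the standard output-size formula. The helper names `conv_out` and `flattened_features` are illustrative and not part of the model's code.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size=32, channels=16):
    """Number of features entering fc1 for a square input of the given size."""
    # conv1: 3x3 kernel, stride 1, padding 1 keeps the spatial size at 32x32
    after_conv = conv_out(input_size, kernel=3, stride=1, padding=1)
    # pool: 2x2 window, stride 2 halves the spatial size to 16x16
    after_pool = conv_out(after_conv, kernel=2, stride=2)
    return channels * after_pool * after_pool


print(flattened_features())  # 16 * 16 * 16 = 4096
```

Because `fc1` is sized for exactly this shape, inputs must be resized to 32x32 before they reach the model.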

Total parameters: ~263K (263,306 for the layers listed above)
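The parameter total can be checked by hand from the layer definitions above. This is a small arithmetic sketch; the helper names `conv2d_params` and `linear_params` are made up for illustration, and each count includes the bias terms.

```python
def conv2d_params(in_channels, out_channels, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return in_channels * out_channels * kernel * kernel + out_channels


def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features


layer_params = {
    "conv1": conv2d_params(3, 16, 3),  # 448
    "fc1": linear_params(4096, 64),    # 262,208
    "fc2": linear_params(64, 10),      # 650
}
total_params = sum(layer_params.values())
print(layer_params, total_params)  # total is 263,306
```

Nearly all of the parameters sit in `fc1`; the max-pooling layer has none.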

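Training Data

The model is trained on a subset of the CIFAR-10 dataset:

Only 20% of the original CIFAR-10 dataset is used
Training samples: 10,000 images (20% of the original 50,000)
Test samples: 2,000 images (20% of the original 10,000)
Image size: 32x32 pixels, 3 RGB channels
Classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck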
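The card does not say how the 20% subset was chosen. One plausible approach, shown here purely as an assumption, is a seeded random sample of indices. The function name `subset_indices` and the seed value are hypothetical.

```python
import random


def subset_indices(n_total, fraction=0.2, seed=0):
    """Return a sorted, reproducible random sample of dataset indices."""
    n_keep = round(n_total * fraction)
    rng = random.Random(seed)  # fixed seed so the same subset is drawn each run
    return sorted(rng.sample(range(n_total), n_keep))


train_idx = subset_indices(50_000)  # CIFAR-10 training split
test_idx = subset_indices(10_000)   # CIFAR-10 test split
print(len(train_idx), len(test_idx))  # 10000 2000
```

Index lists like these could be passed to `torch.utils.data.Subset` to wrap the full dataset. Taking the first 20% of each split would give the same sizes, so the sample counts alone do not reveal which method was used.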

Training Procedure

Optimizer: Adam (lr=0.001)
Batch size: 128
Epochs: 2
Loss function: CrossEntropyLoss
Data preprocessing:
- Resize to 32x32
- Normalize (mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
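Two consequences of the settings above are easy to check by hand: what the normalization does to pixel values, and how many optimizer steps the run takes. The sketch below assumes pixel channels are already scaled to [0, 1] (as `ToTensor` produces) and that the last partial batch of each epoch is kept. The names `normalize_pixel` and `steps_per_epoch` are illustrative.

```python
import math

MEAN = (0.5, 0.5, 0.5)
STD = (0.5, 0.5, 0.5)


def normalize_pixel(rgb):
    """Normalize one RGB pixel whose channels lie in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, MEAN, STD))


# Mean 0.5 and std 0.5 map the range [0, 1] onto [-1, 1]
print(normalize_pixel((0.0, 0.5, 1.0)))  # (-1.0, 0.0, 1.0)

# Assumption: the final partial batch is kept rather than dropped
steps_per_epoch = math.ceil(10_000 / 128)
total_steps = steps_per_epoch * 2
print(steps_per_epoch, total_steps)  # 79 158
```

At roughly 158 weight updates in total, the run is very short, which is consistent with both the quick training times and the modest accuracy reported in this card.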

Training Time

CPU (Intel i5 or equivalent): about 5-10 minutes
CPU (Intel i7 or equivalent): about 3-5 minutes
GPU (any configuration): under 1 minute

Performance

Accuracy on the CIFAR-10 test subset is approximately **50-55%**.
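For reference, the accuracy figure is plain top-1 accuracy: the fraction of test images whose predicted class matches the label. A minimal sketch follows; the function name `top1_accuracy` and the sample predictions are made up for illustration and are not results from this model.

```python
def top1_accuracy(predictions, labels):
    """Fraction of predictions that exactly match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical class indices for five images (illustration only)
preds = [3, 5, 1, 0, 8]
labels = [3, 3, 1, 0, 9]
print(top1_accuracy(preds, labels))  # 0.6
```

With 10 roughly balanced classes, random guessing would score about 10%, so 50-55% shows the model has learned useful features even though it is far from state of the art.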

Usage

Using the Transformers library

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Load the model and processor (replace "your-username" with the repo owner)
processor = AutoImageProcessor.from_pretrained("your-username/simple-cnn-cifar10-lite")
model = AutoModelForImageClassification.from_pretrained("your-username/simple-cnn-cifar10-lite")

# Load an image and preprocess it
image = Image.open("path_to_image.jpg")
inputs = processor(images=image, return_tensors="pt")

# Predict
outputs = model(**inputs)
predicted_class_idx = outputs.logits.argmax(-1).item()
print(f"Predicted class: {model.config.id2label[predicted_class_idx]}")

Using PyTorch

import torch
from PIL import Image
import torchvision.transforms as transforms

# Define the model architecture
class SimpleConvNetLite(torch.nn.Module):
    def __init__(self, num_classes=10):
        super(SimpleConvNetLite, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 16, 3, padding=1)
        self.pool = torch.nn.MaxPool2d(2, 2)
        self.fc1 = torch.nn.Linear(16 * 16 * 16, 64)
        self.fc2 = torch.nn.Linear(64, num_classes)
        
    def forward(self, x):
        x = self.pool(torch.nn.functional.relu(self.conv1(x)))
        x = x.view(-1, 16 * 16 * 16)
        x = torch.nn.functional.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Load the trained weights
model = SimpleConvNetLite()
model.load_state_dict(torch.load("pytorch_model.bin", map_location=torch.device('cpu')))
model.eval()

# Image preprocessing (must match the training-time transforms)
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Class names, indexed by label
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Load an image and predict
image = Image.open("path_to_image.jpg").convert('RGB')
image_tensor = transform(image).unsqueeze(0)
with torch.no_grad():
    outputs = model(image_tensor)
    _, predicted = torch.max(outputs, 1)
    print(f"Predicted class: {classes[predicted.item()]}")
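The model returns raw logits, so the PyTorch example above reports only the top class. To see how confident a prediction is, the logits can be turned into probabilities with a softmax. The sketch below does this in plain Python; `softmax`, `top_k`, and the example logits are illustrative and not output from this model.

```python
import math

CLASSES = ("airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck")


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most probable (class name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(CLASSES[i], probs[i]) for i in ranked[:k]]


# Made-up logits for one image (illustration only)
example_logits = [0.1, -1.2, 0.3, 2.0, 0.0, 1.5, -0.5, 0.2, -0.8, -0.3]
for name, prob in top_k(example_logits):
    print(f"{name}: {prob:.3f}")
```

In PyTorch the equivalent step would be `torch.softmax(outputs, dim=1)` followed by `torch.topk`. Given the model's accuracy, the second and third candidates are often worth showing alongside the top one.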

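Strengths and Limitations

Strengths

Fast training: training finishes in under 10 minutes on a CPU
Lightweight: the model is small and suited to resource-constrained deployments
Easy to understand: the simple architecture works well for learning and teaching

Limitations

Low accuracy: the lite version reaches only about 50-55%, lower than a full-size model
Limited feature extraction: a single convolutional layer cannot learn deep feature hierarchies
Demonstration only: intended for quick demos and teaching, not for production use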

Project Links

Project code: [GitHub repository link]
Hugging Face Space demo: [your-username/simple-cnn-cifar10-lite-demo]

License

MIT

This model was created by [your name] for Hugging Face learning and demonstration purposes.

Evaluation results

Accuracy on CIFAR-10 (20% subset), self-reported: 0.520