Upload 5 files

Files changed (5)

README.md +52 -3
config.json +48 -0
preprocessor_config.json +17 -0
pytorch_model.bin +3 -0
training_args.bin +3 -0

README.md CHANGED

@@ -1,3 +1,52 @@
----
-license: mit
----

+# 🖼️ Image Classification Model - ViT Fine-tuned on CIFAR-10
+## 📝 Description
+This is a Vision Transformer (ViT) model fine-tuned on the CIFAR-10 dataset from a pretrained checkpoint (`google/vit-base-patch16-224`, according to the `_name_or_path` field in this repository's config.json). The model is trained to classify images into 10 classes, each representing a specific object category.
+## 📌 Task
+Problem type: Image classification
+Number of classes: 10 (the CIFAR-10 labels)
+## 📥 Input
+Format: RGB color image
+Image size: 224x224 pixels
+## 📤 Output
+Format: Raw logits, one per class (apply softmax to obtain probabilities)
+Data type: Tensor of shape [batch_size, 10]
+Meaning: Unnormalized scores for each of the 10 CIFAR-10 classes; the highest score marks the predicted class
+## 🛠 Required libraries
+Install the required libraries with:
+```bash
+pip install transformers torch torchvision pillow
+```
+## 🧪 How to use the model
+The example below shows how to use the model to classify a single image:
+```python
+import torch
+from transformers import ViTForImageClassification, ViTImageProcessor
+from PIL import Image
+# Load the image to classify; convert to RGB so grayscale or RGBA files also work
+image = Image.open("path_to_your_image.jpg").convert("RGB")
+# Load the processor and the model from the Hugging Face Hub
+processor = ViTImageProcessor.from_pretrained("zhaospei/Model_7")
+model = ViTForImageClassification.from_pretrained("zhaospei/Model_7")
+# Preprocess the input (resize to 224x224 and normalize)
+inputs = processor(images=image, return_tensors="pt")
+# Run inference without tracking gradients
+with torch.no_grad():
+    outputs = model(**inputs)
+    logits = outputs.logits
+    predicted_label = logits.argmax(-1).item()
+print(f"Predicted label: {model.config.id2label[predicted_label]}")
+```
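
Reviewer note: the model head returns raw logits, so a softmax is needed before the values can be read as probabilities. The following is a minimal, dependency-free sketch of that step. The `softmax` and `top_k` helpers and the example logits are made up for illustration and are not part of this commit; with PyTorch, `torch.softmax(logits, dim=-1)` should give the same result.

```python
import math

# Class names in the index order of id2label in config.json.
ID2LABEL = ["airplane", "automobile", "bird", "cat", "deer",
            "dog", "frog", "horse", "ship", "truck"]


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(ID2LABEL[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits for one image (not produced by the real model).
example_logits = [0.1, -1.2, 0.3, 4.0, 0.0, 2.5, -0.5, 0.2, -2.0, -0.7]

if __name__ == "__main__":
    for label, prob in top_k(example_logits):
        print(f"{label}: {prob:.3f}")
```

Reporting the top few classes rather than only the argmax makes it easier to spot confusable pairs such as cat and dog.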

config.json ADDED

	@@ -0,0 +1,48 @@

+{
+  "_name_or_path": "google/vit-base-patch16-224",
+  "architectures": [
+    "ViTForImageClassification"
+  ],
+  "attention_probs_dropout_prob": 0.0,
+  "encoder_stride": 16,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.0,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "airplane",
+    "1": "automobile",
+    "2": "bird",
+    "3": "cat",
+    "4": "deer",
+    "5": "dog",
+    "6": "frog",
+    "7": "horse",
+    "8": "ship",
+    "9": "truck"
+  },
+  "image_size": 224,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "airplane": 0,
+    "automobile": 1,
+    "bird": 2,
+    "cat": 3,
+    "deer": 4,
+    "dog": 5,
+    "frog": 6,
+    "horse": 7,
+    "ship": 8,
+    "truck": 9
+  },
+  "layer_norm_eps": 1e-12,
+  "model_type": "vit",
+  "num_attention_heads": 12,
+  "num_channels": 3,
+  "num_hidden_layers": 12,
+  "patch_size": 16,
+  "problem_type": "single_label_classification",
+  "qkv_bias": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.22.1"
+}
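
Reviewer note: a few derived shapes follow directly from the values above. The sketch below works through the arithmetic. The helper name `vit_shape_summary` is hypothetical, and the extra token in the sequence length assumes the standard ViT layout with one prepended class token.

```python
# Values copied from the config.json added in this commit.
CONFIG = {
    "image_size": 224,
    "patch_size": 16,
    "hidden_size": 768,
    "num_attention_heads": 12,
    "num_channels": 3,
}


def vit_shape_summary(cfg):
    """Derive patch-grid and attention-head sizes from a ViT config dict."""
    side = cfg["image_size"] // cfg["patch_size"]
    num_patches = side * side
    return {
        "patches_per_side": side,
        "num_patches": num_patches,
        # Assumption: one class token is prepended to the patch sequence.
        "sequence_length": num_patches + 1,
        "head_dim": cfg["hidden_size"] // cfg["num_attention_heads"],
        # Number of raw pixel values flattened into each patch.
        "patch_input_dim": cfg["patch_size"] ** 2 * cfg["num_channels"],
    }


if __name__ == "__main__":
    for name, value in vit_shape_summary(CONFIG).items():
        print(f"{name}: {value}")
```

With these settings each 224x224 image becomes a 14x14 grid of 196 patches (197 tokens including the class token), and each of the 12 attention heads works in 64 dimensions.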

preprocessor_config.json ADDED

	@@ -0,0 +1,17 @@

+{
+  "do_normalize": true,
+  "do_resize": true,
+  "feature_extractor_type": "ViTFeatureExtractor",
+  "image_mean": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "image_std": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "resample": 2,
+  "size": 224
+}
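
Reviewer note: with `image_mean` and `image_std` both set to 0.5, normalization maps pixel values into the range [-1, 1]. Below is a minimal sketch of that per-pixel step. It assumes 8-bit pixel values are first rescaled to [0, 1] by dividing by 255, which this file does not state explicitly. The `normalize_pixel` helper is hypothetical. The `resample` value of 2 corresponds to bilinear interpolation in Pillow.

```python
# Values copied from preprocessor_config.json in this commit.
IMAGE_MEAN = [0.5, 0.5, 0.5]
IMAGE_STD = [0.5, 0.5, 0.5]


def normalize_pixel(rgb):
    """Normalize one 8-bit RGB pixel channel by channel.

    Assumption: each channel is rescaled from [0, 255] to [0, 1]
    before the mean is subtracted and the result divided by std.
    """
    return [
        (channel / 255 - mean) / std
        for channel, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    ]


if __name__ == "__main__":
    print(normalize_pixel((0, 0, 0)))        # black pixel
    print(normalize_pixel((255, 255, 255)))  # white pixel
    print(normalize_pixel((255, 0, 128)))    # mixed pixel
```

Feeding images normalized with different statistics (for example the ImageNet mean and std) would shift the input distribution away from what the model saw during fine-tuning, so the processor shipped with the checkpoint should be used at inference time.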

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:92c692e53bf22eeaf5cb06bc59e3d5b7ff8d3a79d38f5f3a17283596f25f06bf
+size 343291569

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:85de8c2443edd63bab0d98c36637cbba93f17087b2ab26eec8dbc45cf21313eb
+size 3375
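
Reviewer note: the two `.bin` entries are Git LFS pointer files, so the diff shows only a hash and a byte count rather than the binary content. The sketch below shows one way to check a downloaded file against such a pointer. The helpers `parse_lfs_pointer` and `matches_pointer` are hypothetical names and are not part of any Git LFS tooling.

```python
import hashlib

# Pointer text for training_args.bin, copied from this commit.
TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:85de8c2443edd63bab0d98c36637cbba93f17087b2ab26eec8dbc45cf21313eb
size 3375
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Check that raw bytes have the size and hash recorded in a pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(TRAINING_ARGS_POINTER)
    print(pointer["algo"], pointer["size"])
    # To verify a real download (the local path is an assumption):
    # with open("training_args.bin", "rb") as f:
    #     print(matches_pointer(f.read(), pointer))
```

Reading the whole file into memory is fine for the small `training_args.bin`; for the roughly 343 MB `pytorch_model.bin`, hashing in chunks would be the more economical choice.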