sosigikiller committed on
Commit 3a73e22 · 2 Parent(s): 3cce567 46c9ff3

Merge branch 'main' of https://huggingface.co/Hyunil/CSATv2

  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ datasets:
+ - ILSVRC/imagenet-1k
+ metrics:
+ - accuracy
+ ---
+ # CSATv2
+
+ CSATv2 is a lightweight vision backbone for high-resolution inputs, designed to maximize throughput at 512×512 resolution.
+ By applying a frequency-domain projection at the input stage, the model suppresses redundant spatial information and achieves fast inference (see the throughput figure under Highlights).
+
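To make the idea of a frequency-domain projection concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not CSATv2's implementation: the README does not say which transform the model uses, so the 2D DCT, the 8×8 block size, and the helper names (`dct2`, `idct2`, `keep_low_frequencies`) are all hypothetical choices made for this example. It shows that a smooth image block can be rebuilt almost exactly from a quarter of its frequency coefficients, which is the sense in which spatial information can be redundant.

```python
import math


def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n nested list."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def dct2(block):
    """2D DCT of a square block: D * X * D^T."""
    d = dct_matrix(len(block))
    return matmul(matmul(d, block), transpose(d))


def idct2(coeffs):
    """Inverse 2D DCT: D^T * C * D (valid because D is orthonormal)."""
    d = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(d), coeffs), d)


def keep_low_frequencies(coeffs, keep):
    """Zero every coefficient outside the top-left keep x keep corner."""
    return [
        [c if (u < keep and v < keep) else 0.0 for v, c in enumerate(row)]
        for u, row in enumerate(coeffs)
    ]


# Assumed toy input: a smooth 8x8 gradient block (not real image data).
block = [[float(x + y) for x in range(8)] for y in range(8)]

coeffs = dct2(block)
compressed = keep_low_frequencies(coeffs, keep=4)  # keep 16 of 64 coefficients
recon = idct2(compressed)

max_error = max(abs(a - b) for ra, rb in zip(block, recon) for a, b in zip(ra, rb))
total_energy = sum(c * c for row in coeffs for c in row)
kept_energy = sum(c * c for row in compressed for c in row)
energy_fraction = kept_energy / total_energy

print(f"max reconstruction error: {max_error:.4f}")
print(f"energy kept by low frequencies: {energy_fraction:.6f}")
```

Real images are less smooth than this gradient, so the fraction of coefficients that can be dropped depends on the content and on how the model is trained.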
+ ## Highlights
+
+ - 🚀 **2,800 images/s at 512×512 resolution (1× A6000 GPU)**
+ - ⚡ **Frequency-Domain Projection** for removing redundant spatial information
+ - 🎯 **80.02%** ImageNet-1K Top-1 Accuracy
+ - 🪶 Only **11M parameters**
+ - 🧩 Suitable for **image classification** or as a **high-throughput detection backbone**
+
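The throughput figure above can be put in context with a little arithmetic and a generic timing loop. The sketch below is an assumption-laden illustration: the README does not state the batch size, numeric precision, or measurement procedure behind the 2,800 images/s number, and `measure_throughput` and `dummy_batch` are hypothetical helpers, with the dummy workload standing in for a real forward pass.

```python
import time


def measure_throughput(run_batch, batch_size, warmup=3, iters=10):
    """Return items processed per second for a callable that runs one batch.

    When timing a GPU model with PyTorch, call torch.cuda.synchronize()
    inside run_batch (or before reading the clock), because CUDA kernels
    are launched asynchronously.
    """
    for _ in range(warmup):  # warm-up runs are excluded from the timing
        run_batch()
    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def dummy_batch():
    # Hypothetical stand-in workload; replace with a real model forward pass.
    sum(i * i for i in range(10_000))


images_per_second = measure_throughput(dummy_batch, batch_size=32)
print(f"measured (dummy workload): {images_per_second:.0f} items/s")

# What the reported figure implies, taking 2,800 images/s at 512x512 as given.
reported_images_per_second = 2800
amortized_ms_per_image = 1000.0 / reported_images_per_second
pixels_per_second = reported_images_per_second * 512 * 512

print(f"amortized time per image: {amortized_ms_per_image:.3f} ms")
print(f"pixels processed per second: {pixels_per_second:,}")
```

The amortized time per image is not the same as single-image latency: throughput is normally measured over batches, so one image processed alone will typically take longer than this average.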
+ This model is an improved version of the architecture described in this [paper](https://www.mdpi.com/2306-5354/10/11/1279).
+
+ Special thanks to **Demino** for contributing ideas and feedback that greatly helped in slimming down and optimizing the model.
+
+ ## Model description
+
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/633a801b7646c9f51a05cc92/pynK0OWbjH5WUlu8L7OTj.png)
+
+ This model is designed primarily for image classification and can also serve as a high-throughput backbone for object detection. The following example runs classification on a single image:
+
+ ```python
+ import torch
+ from datasets import load_dataset
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
+
+ # Example data: a cat image
+ dataset = load_dataset("huggingface/cats-image")
+ image = dataset["test"]["image"][0]
+
+ # 👉 Swap in the CSATv2 model
+ model_name = "Hyunil/CSATv2"
+
+ # Load the preprocessor and the model
+ # (trust_remote_code=True executes custom code shipped in the model repository)
+ processor = AutoImageProcessor.from_pretrained(model_name, trust_remote_code=True)
+ model = AutoModelForImageClassification.from_pretrained(model_name, trust_remote_code=True)
+
+ # Preprocess
+ inputs = processor(image, return_tensors="pt")
+
+ # Inference (no gradients needed)
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # Pick the highest-scoring class and map it to its label
+ pred = logits.argmax(-1).item()
+ print("Predicted label:", model.config.id2label[pred])
+ ```
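The example above prints only the single best label. A common follow-up is to turn the logits into probabilities and list the top few classes. The sketch below shows that step in plain Python so it runs without downloading the model; the logits and the four labels are made-up values for illustration, not CSATv2 outputs, and `softmax` and `top_k` are hypothetical helper names.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (max is subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


# Hypothetical logits and labels, standing in for model(**inputs).logits[0]
# and model.config.id2label from the example above.
example_logits = [1.2, 4.5, 0.3, 3.9]
example_labels = {0: "goldfish", 1: "tabby cat", 2: "school bus", 3: "Egyptian cat"}

predictions = top_k(example_logits, example_labels, k=3)
for label, prob in predictions:
    print(f"{label}: {prob:.3f}")
```

With the real model, the same result can be obtained directly in PyTorch with `torch.softmax(logits, dim=-1).topk(5)`, then mapping the returned indices through `model.config.id2label`.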