---
datasets:
- fuliucansheng/pascal_voc
metrics:
- roc_auc
base_model:
- BobMcDear/swin_s3_base_224
pipeline_tag: image-classification
license: apache-2.0
---

# 🦢 Swin S3 Base (224) - Pascal VOC

A Swin S3 Base model fine-tuned on the Pascal VOC 2012 dataset for multi-class image classification.

---

## 🧠 Model Details

- **Architecture**: Swin S3 Base (`224x224` input size)
- **Pretrained on**: ImageNet-1k
- **Fine-tuned on**: Pascal VOC 2012
- **Framework**: PyTorch (`timm` implementation)
- **Format**: `safetensors`

---

## 🎯 Intended Use

- **Primary task**: Image classification of natural scenes featuring objects from the 20 Pascal VOC categories.
- **Users**: Researchers and developers working on computer vision applications or model benchmarking.
- **Not intended for**: Real-time decision making in safety-critical applications (e.g., autonomous vehicles, medical diagnosis).

---

## ⚠️ Limitations and Ethical Considerations

- **Biases**: The model inherits biases present in Pascal VOC, such as underrepresentation of certain object types, contexts, or demographics. It may perform poorly on out-of-distribution samples.
- **Ethical Use**: Avoid using this model for applications that could reinforce harmful stereotypes, cause social harm, or violate privacy (e.g., surveillance).
- **Transparency**: This model is shared for research and educational use and should not be deployed without thorough fairness, robustness, and security evaluations.
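
---

## 🚀 Usage Sketch

The Model Details above name a `timm` architecture and a `safetensors` checkpoint, so a minimal loading and inference sketch may help. Everything here is an assumption rather than the author's pipeline: the `timm` model name `swin_s3_base_224`, the helper names (`load_model`, `predict`, `decode_logits`), the alphabetical class order, the 0.5 threshold, and the use of a per-class sigmoid (suggested by the BCE loss listed in this card).

```python
import math

# The 20 Pascal VOC categories in the usual alphabetical order.
# ASSUMPTION: the checkpoint's output indices follow this order; verify before use.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_logits(logits, threshold=0.5):
    """Turn 20 raw logits into (label, probability) pairs at or above `threshold`."""
    if len(logits) != len(VOC_CLASSES):
        raise ValueError(f"expected {len(VOC_CLASSES)} logits, got {len(logits)}")
    scored = [(name, _sigmoid(z)) for name, z in zip(VOC_CLASSES, logits)]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def load_model(weights_path="model.safetensors"):
    """Build the architecture with timm and load the fine-tuned weights.

    ASSUMPTION: the state-dict keys match timm's `swin_s3_base_224` layout.
    Requires `timm`, `torch`, and `safetensors` to be installed.
    """
    import timm
    from safetensors.torch import load_file

    model = timm.create_model(
        "swin_s3_base_224", pretrained=False, num_classes=len(VOC_CLASSES)
    )
    model.load_state_dict(load_file(weights_path))
    return model.eval()


def predict(model, image_path, threshold=0.5):
    """Run one image through the model and decode the outputs."""
    import torch
    from PIL import Image
    from timm.data import create_transform, resolve_data_config

    # Use the preprocessing (resize, crop, normalization) that timm associates with the model.
    transform = create_transform(**resolve_data_config({}, model=model))
    batch = transform(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)[0].tolist()
    return decode_logits(logits, threshold)
```

With the weights downloaded locally, `predict(load_model(), "example.jpg")` would return the categories whose sigmoid score reaches the threshold, highest first; `example.jpg` is a placeholder path.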
---

## ⚙️ Training Details

- **Training library**: `timm` + PyTorch
- **Epochs**: 5
- **Batch size**: 16
- **Optimizer**: AdamW
- **Learning rate**: 5e-5
- **Scheduler**: Cosine Annealing
- **Loss function**: BCE
- **Hardware**: 1x NVIDIA A100 on Google Colab Pro

> ℹ️ [Link to experiment tracking dashboard (e.g., Weights & Biases)](https://wandb.ai/your-project/your-run-id) *(optional)*

---

## 📊 Evaluation Results

Evaluated on Pascal VOC 2012 test set:

| Metric    | Value |
|-----------|-------|
| `roc_auc` | 98.9% |

> *Note: Evaluation performed using standard multi-class metrics. Model was not evaluated on cross-domain generalization.*

---

## 📚 Dataset

- **Name**: Pascal VOC 2012
- **License**: Creative Commons Attribution 4.0 International
- **Labels**: 20 object categories (person, car, dog, etc.)
- **Split used**: Training for fine-tuning, validation for evaluation

---

## 💾 Files in This Repository

- `model.safetensors`: Model weights
- `README.md`: Model card (this file)

---

## 🔗 Citations

```bibtex
@inproceedings{liu2021swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yu and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  booktitle={ICCV},
  year={2021}
}

@article{Everingham10,
  author = {Everingham, M. and Van Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.},
  title = {The Pascal Visual Object Classes (VOC) Challenge},
  journal = {IJCV},
  year = {2010},
  volume = {88},
  number = {2},
  pages = {303--338}
}
```
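
---

## 🧮 Metric Sketch

The Evaluation Results report a single `roc_auc` value without stating how it was aggregated over the 20 categories. As an illustration only, the sketch below computes a macro-averaged ROC AUC (one AUC per category, then the plain mean). Macro averaging is an assumption, and `binary_roc_auc` and `macro_roc_auc` are hypothetical helper names; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice in practice.

```python
def binary_roc_auc(labels, scores):
    """ROC AUC for one category via the rank-sum (Mann-Whitney U) formulation.

    `labels` holds 0/1 ground truth, `scores` the predicted scores.
    Tied scores receive their average rank.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both positive and negative samples")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def macro_roc_auc(label_rows, score_rows):
    """Mean of per-category AUCs; rows are images, columns are categories."""
    n_classes = len(label_rows[0])
    per_class = [
        binary_roc_auc([row[c] for row in label_rows], [row[c] for row in score_rows])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy example with 4 images and 2 categories (made-up numbers, not model outputs).
toy_labels = [[0, 1], [0, 0], [1, 1], [1, 0]]
toy_scores = [[0.1, 0.9], [0.4, 0.2], [0.35, 0.8], [0.8, 0.1]]
toy_auc = macro_roc_auc(toy_labels, toy_scores)
```

For the real evaluation, the rows would be the ground-truth label vectors and the model's per-category scores on the evaluation split.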