---
license: mit
datasets:
- ArtifactClfDurham/OrientalMuseum-white
language:
- en
base_model:
- google/efficientnet-b0
tags:
- artifact
- museum
---

# Artifact Classification Model v2 - Best Model Usage Guide

This directory contains the improved v2 artifact classification model, which classifies museum artifacts by both object type and material.

## Hosted Model

The best model is available on Hugging Face at: **[SpyC0der77/artifact-efficientnet](https://huggingface.co/SpyC0der77/artifact-efficientnet)**

You can use the model directly from Hugging Face without downloading it locally.

## Model Overview

The v2 model is a multi-output neural network that predicts two attributes simultaneously:

- **Object Name**: The type/category of the artifact (e.g., "vase", "statue", "pottery")
- **Material**: The material composition (e.g., "ceramic", "bronze", "stone")

### Key Improvements Over v1

- **EfficientNet Backbone**: Uses EfficientNet-B0 instead of ResNet-50 for feature extraction
- **Attention Mechanism**: Includes an attention layer to focus on relevant features
- **Advanced Training**: Incorporates CutMix augmentation, Focal Loss, and mixed precision training
- **Better Regularization**: Uses dropout and batch normalization for improved generalization

## Architecture & Usage

The v2 model uses an EfficientNet-B0 backbone with an attention mechanism for multi-output classification. It processes RGB images of artifacts and outputs predictions for both object type and material composition.

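
As a rough illustration of what "multi-output" means in practice, the sketch below turns the two sets of logits (keyed `'object_name'` and `'material'`, as in this guide) into a label and a confidence score per head. It uses plain Python lists in place of tensors so that it runs without PyTorch; the label lists and logit values are hypothetical and are not taken from the trained model.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_outputs(outputs, class_names):
    """Map each head's logits to its top label and that label's probability."""
    decoded = {}
    for head, logits in outputs.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        decoded[head] = (class_names[head][best], probs[best])
    return decoded


# Hypothetical label sets and logits, for illustration only.
CLASS_NAMES = {
    "object_name": ["vase", "statue", "pottery"],
    "material": ["ceramic", "bronze", "stone"],
}
example_outputs = {
    "object_name": [2.0, 0.5, -1.0],
    "material": [0.1, 3.0, 0.2],
}

decoded = decode_outputs(example_outputs, CLASS_NAMES)
for head, (label, confidence) in decoded.items():
    print(f"{head}: {label} ({confidence:.2f})")
```

With the real model, the same two steps (softmax, then argmax) would presumably be applied to each tensor in the returned dictionary, using the label mappings from training.
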
### Input

- **Format**: RGB images (224×224 pixels after preprocessing)
- **Preprocessing**: Resize to 256×256, center crop to 224×224, normalize with ImageNet statistics

### Output

- **Object Classification**: Predicts artifact type (e.g., "vase", "statue", "pottery")
- **Material Classification**: Predicts material composition (e.g., "ceramic", "bronze", "stone")
- **Confidence Scores**: Per-class probabilities, obtained by applying softmax to each head's logits
- **Format**: Dictionary with `'object_name'` and `'material'` logits

## Model Architecture

```text
ImprovedMultiOutputModel(
  backbone: EfficientNet-B0 (pretrained)
  attention: Linear(1280 → 512 → 1280) with Sigmoid
  object_classifier: Linear(1280 → 1024 → 512 → num_object_classes)
  material_classifier: Linear(1280 → 1024 → 512 → num_material_classes)
)
```

### Input Requirements

- **Image Size**: 224×224 pixels (automatically resized and cropped)
- **Format**: RGB images
- **Normalization**: ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

### Output Format

Returns a dictionary with:

- `'object_name'`: Logits for object classification
- `'material'`: Logits for material classification

## Training Details

The model was trained with the following configuration:

- **Dataset**: ArtifactClfDurham/OrientalMuseum-white
- **Training Split**: 85% of data
- **Validation Split**: 15% of data
- **Batch Size**: 32
- **Epochs**: 20
- **Optimizer**: AdamW with differential learning rates
  - Backbone: 2e-4 (0.1× base LR)
  - Heads: 2e-3 (base LR)
- **Augmentation**: Advanced (CutMix, rotation, color jitter, Gaussian blur)
- **Loss Function**: Cross-Entropy (or Focal Loss if enabled)
- **Scheduler**: Cosine annealing with warmup

### Advanced Training Features

- **CutMix Augmentation**: Randomly mixes image patches between samples
- **Focal Loss**: Addresses class imbalance (optional)
- **Mixed Precision**: Automatic mixed precision training for speed
- **Gradient Scaling**: Prevents gradient underflow during mixed precision training
- **Early Stopping**: Saves the best model based on validation accuracy

## Troubleshooting

### Common Issues

1. **CUDA Out of Memory**
   - Reduce batch size: `--batch_size 8`
   - Use CPU: Set device to "cpu"

2. **Import Errors**
   - Ensure all dependencies are installed
   - Check that the Python path includes the project root

3. **Model Loading Errors**
   - Verify the model file path is correct
   - Ensure PyTorch version compatibility

4. **Low Confidence Scores**
   - The model may not have been trained on similar artifacts
   - Check that image preprocessing matches the training setup

### Performance Tips

- Use GPU for faster inference
- Process images in batches for efficiency
- Consider model quantization for deployment

## Model Limitations

- Trained specifically on Oriental Museum artifacts
- May not generalize well to artifacts from other cultures/regions
- Performance depends on image quality and lighting
- Multi-output nature may have trade-offs between object and material accuracy
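
Given these limitations and the "Low Confidence Scores" issue under Troubleshooting, one option is to flag predictions whose confidence falls below a cutoff and send them for manual review. The sketch below is only an illustration: `flag_uncertain`, the 0.5 threshold, and the example predictions are hypothetical, and a sensible cutoff would need to be chosen on validation data.

```python
def flag_uncertain(predictions, threshold=0.5):
    """Return the heads whose top-class confidence is below the threshold.

    `predictions` maps a head name to a (label, confidence) pair.
    The default threshold is an arbitrary placeholder, not a tuned value.
    """
    return [
        head
        for head, (_label, confidence) in predictions.items()
        if confidence < threshold
    ]


# Hypothetical predictions, for illustration only.
predictions = {
    "object_name": ("vase", 0.91),
    "material": ("ceramic", 0.34),
}

uncertain = flag_uncertain(predictions)
if uncertain:
    print("Review manually:", ", ".join(uncertain))
```

Checking each head separately reflects the multi-output design: the model may be confident about the object type while unsure about the material, or the reverse.
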