---
license: mit
datasets:
- ArtifactClfDurham/OrientalMuseum-white
language:
- en
base_model:
- google/efficientnet-b0
tags:
- artifact
- museum
---
# Artifact Classification Model v2 - Best Model Usage Guide
This directory contains the improved v2 artifact classification model, which classifies museum artifacts by both object type and material.
## Hosted Model
The best model is available on Hugging Face at: **[SpyC0der77/artifact-efficientnet](https://huggingface.co/SpyC0der77/artifact-efficientnet)**
You can use the model directly from Hugging Face without downloading it locally.
## Model Overview
The v2 model is an advanced multi-output neural network that predicts two attributes simultaneously:
- **Object Name**: The type/category of the artifact (e.g., "vase", "statue", "pottery")
- **Material**: The material composition (e.g., "ceramic", "bronze", "stone")
### Key Improvements Over v1
- **EfficientNet Backbone**: Uses EfficientNet-B0 instead of ResNet-50 for better feature extraction
- **Attention Mechanism**: Includes an attention layer to focus on relevant features
- **Advanced Training**: Incorporates CutMix augmentation, Focal Loss, and mixed precision training
- **Better Regularization**: Uses dropout and batch normalization for improved generalization
## Architecture & Usage
The v2 model uses an EfficientNet-B0 backbone with an attention mechanism for multi-output classification. It processes RGB images of artifacts and outputs predictions for both object type and material composition.
### Input
- **Format**: RGB images (224×224 pixels after preprocessing)
- **Preprocessing**: Resize to 256×256, center crop to 224×224, normalize with ImageNet statistics
### Output
- **Object Classification**: Predicts artifact type (e.g., "vase", "statue", "pottery")
- **Material Classification**: Predicts material composition (e.g., "ceramic", "bronze", "stone")
- **Confidence Scores**: Probability scores for each prediction (softmax over the corresponding logits)
- **Format**: Dictionary with 'object_name' and 'material' logits
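Turning the logits dictionary into a label and a confidence score can be sketched as below. This is a minimal illustration in plain Python, assuming the logits have already been converted to lists (for example with `.tolist()` on a tensor); the `labels` lists, the sample logits, and the helper names `softmax` and `decode` are hypothetical and not part of the released model.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(outputs, labels):
    """Pick the top label and its probability for each output head."""
    result = {}
    for head, logits in outputs.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[head] = (labels[head][best], probs[best])
    return result


# Hypothetical label lists and logits, for illustration only.
labels = {
    "object_name": ["vase", "statue", "pottery"],
    "material": ["ceramic", "bronze", "stone"],
}
outputs = {
    "object_name": [2.0, 0.5, 0.1],
    "material": [0.2, 3.0, 0.4],
}
prediction = decode(outputs, labels)
```

Each entry of `prediction` is a `(label, probability)` pair, one per head, so the two classifications can be reported with independent confidence values.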
## Model Architecture
```python
ImprovedMultiOutputModel(
  backbone: EfficientNet-B0 (pretrained)
  attention: Linear(1280 → 512 → 1280) with Sigmoid
  object_classifier: Linear(1280 → 1024 → 512 → num_object_classes)
  material_classifier: Linear(1280 → 1024 → 512 → num_material_classes)
)
```
### Input Requirements
- **Image Size**: 224×224 pixels (automatically resized and cropped)
- **Format**: RGB images
- **Normalization**: ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
### Output Format
Returns a dictionary with:
- `'object_name'`: Logits for object classification
- `'material'`: Logits for material classification
## Training Details
The model was trained with the following configuration:
- **Dataset**: ArtifactClfDurham/OrientalMuseum-white
- **Training Split**: 85% of data
- **Validation Split**: 15% of data
- **Batch Size**: 32
- **Epochs**: 20
- **Optimizer**: AdamW with differential learning rates
  - Backbone: 2e-4 (0.1× base LR)
  - Heads: 2e-3 (base LR)
- **Augmentation**: Advanced (CutMix, rotation, color jitter, Gaussian blur)
- **Loss Function**: Cross-Entropy (or Focal Loss if enabled)
- **Scheduler**: Cosine annealing with warmup
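The differential learning rates and the warmup-plus-cosine schedule can be illustrated without any framework. In the sketch below the two base rates and the epoch count come from the list above, while the two-epoch warmup, the per-epoch stepping, and the decay to zero are assumptions; `lr_scale` is a hypothetical helper name.

```python
import math

HEAD_LR = 2e-3                 # base LR for the classifier heads
BACKBONE_LR = 0.1 * HEAD_LR    # 2e-4 for the pretrained backbone


def lr_scale(step, warmup_steps, total_steps):
    """Multiplier in [0, 1]: linear warmup, then cosine decay toward zero."""
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Assumed schedule: 20 epochs, 2 of them warmup, stepped once per epoch.
EPOCHS, WARMUP_EPOCHS = 20, 2
schedule = [
    {
        "epoch": epoch,
        "backbone_lr": BACKBONE_LR * lr_scale(epoch, WARMUP_EPOCHS, EPOCHS),
        "head_lr": HEAD_LR * lr_scale(epoch, WARMUP_EPOCHS, EPOCHS),
    }
    for epoch in range(EPOCHS)
]
```

In PyTorch the same idea is usually expressed by passing two parameter groups with their own `lr` to `torch.optim.AdamW` and wrapping the optimizer in `torch.optim.lr_scheduler.LambdaLR` with a multiplier function like `lr_scale`.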
### Advanced Training Features
- **CutMix Augmentation**: Randomly mixes image patches between samples
- **Focal Loss**: Addresses class imbalance (optional)
- **Mixed Precision**: Automatic mixed precision training for speed
- **Gradient Scaling**: Scales the loss during mixed precision training to prevent gradient underflow
- **Early Stopping**: Saves best model based on validation accuracy
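Two of these features are easy to show in isolation. The sketch below uses the commonly published formulations of CutMix box sampling and focal loss in plain Python; the default `alpha` and `gamma` values and the function names are assumptions, since this guide does not state the settings used in training.

```python
import math
import random


def cutmix_box(width, height, alpha=1.0, rng=random):
    """Sample a CutMix patch; return (x1, y1, x2, y2, lam)."""
    lam = rng.betavariate(alpha, alpha)       # share kept from the original
    cut_ratio = math.sqrt(1.0 - lam)          # patch side relative to image
    cut_w, cut_h = int(width * cut_ratio), int(height * cut_ratio)
    cx, cy = rng.randrange(width), rng.randrange(height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    # Recompute lam from the clipped box so labels match the pasted pixels.
    lam = 1.0 - (x2 - x1) * (y2 - y1) / (width * height)
    return x1, y1, x2, y2, lam


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    log_pt = logits[target] - log_sum         # log-probability of the target
    return -((1.0 - math.exp(log_pt)) ** gamma) * log_pt
```

With CutMix the training loss for a mixed sample is typically `lam * loss(y_a) + (1 - lam) * loss(y_b)`, where `y_a` is the label of the original image and `y_b` the label of the image supplying the patch. Setting `gamma=0` reduces `focal_loss` to ordinary cross-entropy.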
## Troubleshooting
### Common Issues
1. **CUDA Out of Memory**
   - Reduce batch size: `--batch_size 8`
   - Use CPU: Set device to "cpu"
2. **Import Errors**
   - Ensure all dependencies are installed
   - Check Python path includes project root
3. **Model Loading Errors**
   - Verify the model file path is correct
   - Ensure PyTorch version compatibility
4. **Low Confidence Scores**
   - Model may not be trained on similar artifacts
   - Check image preprocessing matches training setup
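For the memory and loading issues, a CPU fallback combined with `map_location` often helps. The following is a minimal sketch assuming the checkpoint is a PyTorch state dict; the helper names and the `"model_state_dict"` wrapper key are hypothetical, so adjust them to the actual checkpoint layout.

```python
import torch


def pick_device(prefer_gpu=True):
    """Use CUDA when requested and available, otherwise fall back to CPU."""
    use_cuda = prefer_gpu and torch.cuda.is_available()
    return torch.device("cuda" if use_cuda else "cpu")


def load_weights(model, checkpoint_path, device):
    """Load a state dict onto `device`, even if it was saved from a GPU."""
    state = torch.load(checkpoint_path, map_location=device)
    # Some training scripts wrap the weights in a dict; this key is a guess.
    if isinstance(state, dict) and "model_state_dict" in state:
        state = state["model_state_dict"]
    model.load_state_dict(state)
    return model.to(device).eval()
```

Passing `prefer_gpu=False` forces CPU inference, which avoids CUDA out-of-memory errors at the cost of speed.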
### Performance Tips
- Use GPU for faster inference
- Process images in batches for efficiency
- Consider model quantization for deployment
## Model Limitations
- Trained specifically on Oriental Museum artifacts
- May not generalize well to artifacts from other cultures/regions
- Performance depends on image quality and lighting
- Multi-output nature may have trade-offs between object and material accuracy