Add safetensors models (secure format) and update documentation

- Add best_model_finetuned.safetensors (98.33% accuracy)
- Add best_model_simple.safetensors (93% accuracy)
- Update inference.py to support both .pth and .safetensors
- Update README with security information
- Add safetensors to requirements.txt
- Safetensors format avoids pickle vulnerabilities
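
For context on the last bullet, here is a minimal sketch of why unpickling an untrusted checkpoint is risky. The `Payload` class is hypothetical and deliberately harmless; it only shows that `pickle.loads` calls whatever callable the file's author chose.

```python
import pickle


class Payload:
    """Hypothetical object that asks pickle to call eval() when it is loaded."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" and replays that call on load.
        # An attacker could substitute any importable callable and arguments.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The loader never reconstructs a Payload; it runs the recorded call instead.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

A safetensors file has no equivalent hook: it holds a JSON header and raw tensor bytes, so loading it does not invoke callables stored in the file.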

Files changed (6)

.gitattributes +1 -0
README.md +57 -10
best_model_finetuned.safetensors +3 -0
best_model_simple.safetensors +3 -0
inference.py +111 -12
requirements.txt +1 -0

.gitattributes CHANGED

@@ -1,5 +1,6 @@
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -45,6 +45,17 @@ This model is designed for marine biologists, oceanographers, researchers, and c
 - **Framework**: PyTorch 2.0+
 - **Parameters**: ~11M parameters
 - **Training Time**: ~10 minutes (4 epochs)
 ## Categories
@@ -86,19 +97,20 @@ The model classifies underwater sounds into four distinct categories:
 ### Installation
 ```bash
-pip install torch torchaudio librosa numpy
 ```
-### Quick Start
 ```python
 import torch
 import librosa
 import numpy as np
-# Load model
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
-checkpoint = torch.load("best_model_finetuned.pth", map_location=device)
 # Load and process audio
 audio_path = "underwater_sound.wav"
@@ -127,6 +139,27 @@ class_names = ["vessel", "marine_animal", "natural_sound", "other_anthropogenic"
 print(f"Prediction: {class_names[predicted_class]} ({confidence*100:.2f}%)")
 ```
 ### Using the Complete Pipeline
 For a full-featured implementation with preprocessing and JSON output:
@@ -139,8 +172,8 @@ cd underwater-audio-classifier
 # Install dependencies
 pip install -r requirements.txt
-# Run prediction
-python predict_minimal.py --audio your_audio.wav --model models/best_model_finetuned.pth
 # Generate UDA-compliant JSON
 python generate_json.py --audio your_audio.wav --output result.json
@@ -248,19 +281,33 @@ If you use this model in your research, please cite:
 ## Model Variants
-This repository includes three model variants:
-1. **best_model_finetuned.pth** (Recommended)
    - Fine-tuned ResNet18
    - 98.33% accuracy
    - Best overall performance
-2. **best_model_simple.pth**
    - Custom CNN trained from scratch
    - 93% accuracy
    - Lighter weight alternative
-3. **Marine 1.mlmodel**
    - CoreML format for iOS/macOS deployment
    - Optimized for Apple devices

 - **Framework**: PyTorch 2.0+
 - **Parameters**: ~11M parameters
 - **Training Time**: ~10 minutes (4 epochs)
+- **Format**: Available in both safetensors (recommended) and PyTorch formats
+### 🔒 Security Note
+This model is available in **safetensors** format, which is recommended because loading it does not unpickle data and so avoids pickle's code-execution vulnerabilities. The safetensors format provides:
+- ✅ No arbitrary code execution risks
+- ✅ Fast loading times
+- ✅ Memory-efficient
+- ✅ Cross-platform compatibility
+We recommend using the `.safetensors` files for production use.
 ## Categories
 ### Installation
 ```bash
+pip install torch torchaudio librosa numpy safetensors
 ```
+### Quick Start (Recommended - Safetensors)
 ```python
 import torch
 import librosa
 import numpy as np
+from safetensors.torch import load_file
+# Load model (secure format)
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+state_dict = load_file("best_model_finetuned.safetensors", device=str(device))
 # Load and process audio
 audio_path = "underwater_sound.wav"
 print(f"Prediction: {class_names[predicted_class]} ({confidence*100:.2f}%)")
 ```
+### Using the Inference Class (Easiest)
+```python
+from huggingface_hub import hf_hub_download
+from inference import Marine1Classifier
+# Download model (safetensors format - secure!)
+model_path = hf_hub_download(
+    repo_id="shiv207/Marine1",
+    filename="best_model_finetuned.safetensors"
+)
+# Initialize classifier
+classifier = Marine1Classifier(model_path)
+# Make prediction
+result = classifier.predict("underwater_sound.wav")
+print(f"Prediction: {result['predicted_class']}")
+print(f"Confidence: {result['confidence']*100:.2f}%")
+```
 ### Using the Complete Pipeline
 For a full-featured implementation with preprocessing and JSON output:
 # Install dependencies
 pip install -r requirements.txt
+# Run prediction (supports both .pth and .safetensors)
+python predict_minimal.py --audio your_audio.wav --model models/best_model_finetuned.safetensors
 # Generate UDA-compliant JSON
 python generate_json.py --audio your_audio.wav --output result.json
 ## Model Variants
+This repository includes multiple model formats:
+### Safetensors Format (🔒 Recommended - Secure)
+1. **best_model_finetuned.safetensors** ⭐
    - Fine-tuned ResNet18
    - 98.33% accuracy
+   - Secure format (no pickle vulnerabilities)
    - Best overall performance
+2. **best_model_simple.safetensors**
    - Custom CNN trained from scratch
    - 93% accuracy
    - Lighter weight alternative
+   - Secure format
+### Legacy Formats
+3. **best_model_finetuned.pth**
+   - PyTorch pickle format (legacy)
+   - Use safetensors version instead
+4. **best_model_simple.pth**
+   - PyTorch pickle format (legacy)
+   - Use safetensors version instead
+5. **Marine 1.mlmodel**
    - CoreML format for iOS/macOS deployment
    - Optimized for Apple devices

best_model_finetuned.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:348e72f30c774807db9ed46fdab3508448ab9440009d1033863aa81eda533218
+size 45262356

best_model_simple.safetensors ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43beb5722867e95ef4b1c6361d209d2d688e96e4a1e388c6e721180ee5ae1d3d
+size 2479820
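
The two pointer files above record a SHA-256 digest and a byte size for each model. As a rough sketch, a downloaded file could be checked against its pointer like this. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both equal the pointer's values."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Hash in chunks so large model files are not read into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text copied from the best_model_simple.safetensors entry above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:43beb5722867e95ef4b1c6361d209d2d688e96e4a1e388c6e721180ee5ae1d3d\n"
    "size 2479820\n"
)
```

After downloading, `matches_pointer(Path("best_model_simple.safetensors"), pointer)` would be expected to return `True` for an intact copy.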

inference.py CHANGED

@@ -1,6 +1,7 @@
 """
 Marine1 Underwater Acoustic Classifier - Inference Script
 Simple example for using the model with Hugging Face
 """
 import torch
@@ -10,6 +11,13 @@ from typing import Dict, Tuple
 import warnings
 warnings.filterwarnings('ignore')
 class Marine1Classifier:
     """Underwater acoustic classifier using Marine1 model"""
@@ -19,7 +27,7 @@ class Marine1Classifier:
         Initialize the classifier
         Args:
-            model_path: Path to the .pth model file
             device: Device to run on ('cuda', 'cpu', or 'mps'). Auto-detected if None.
         """
         if device is None:
@@ -33,25 +41,116 @@ class Marine1Classifier:
         self.device = torch.device(device)
         print(f"Using device: {self.device}")
-        # Load checkpoint
-        checkpoint = torch.load(model_path, map_location=self.device, weights_only=False)
-        # Get class mapping
-        self.class_to_id = checkpoint['class_to_id']
         self.id_to_class = {v: k for k, v in self.class_to_id.items()}
         self.class_names = [self.id_to_class[i] for i in range(len(self.id_to_class))]
-        # Load model
-        from torchvision import models
-        self.model = models.resnet18(weights=None)
-        self.model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
-        self.model.fc = torch.nn.Linear(self.model.fc.in_features, len(self.class_names))
-        self.model.load_state_dict(checkpoint['model_state_dict'])
         self.model.to(self.device)
         self.model.eval()
-        print(f"Model loaded successfully with {len(self.class_names)} classes")
     def process_audio(self, audio_path: str, sr: int = 16000, duration: float = 10.0) -> np.ndarray:
         """

 """
 Marine1 Underwater Acoustic Classifier - Inference Script
 Simple example for using the model with Hugging Face
+Supports both .pth (pickle) and .safetensors formats
 """
 import torch
 import warnings
 warnings.filterwarnings('ignore')
+try:
+    from safetensors.torch import load_file
+    SAFETENSORS_AVAILABLE = True
+except ImportError:
+    SAFETENSORS_AVAILABLE = False
+    print("Warning: safetensors not installed. Install with: pip install safetensors")
 class Marine1Classifier:
     """Underwater acoustic classifier using Marine1 model"""
         Initialize the classifier
         Args:
+            model_path: Path to the model file (.pth or .safetensors)
             device: Device to run on ('cuda', 'cpu', or 'mps'). Auto-detected if None.
         """
         if device is None:
         self.device = torch.device(device)
         print(f"Using device: {self.device}")
+        # Determine file format
+        is_safetensors = model_path.endswith('.safetensors')
+        if is_safetensors:
+            if not SAFETENSORS_AVAILABLE:
+                raise ImportError("safetensors not installed. Install with: pip install safetensors")
+            print("Loading safetensors model (secure format)...")
+            # Load safetensors
+            state_dict = load_file(model_path, device=str(self.device))
+            # Parse metadata
+            from safetensors import safe_open
+            with safe_open(model_path, framework="pt", device=str(self.device)) as f:
+                metadata = f.metadata() or {}  # metadata() returns None when the file has no metadata
+            # Get class mapping from metadata
+            import ast
+            self.class_to_id = ast.literal_eval(metadata.get('class_to_id', "{}"))
+            if not self.class_to_id:
+                # Default mapping
+                self.class_to_id = {
+                    'vessel': 0, 'marine_animal': 1,
+                    'natural_sound': 2, 'other_anthropogenic': 3
+                }
+        else:
+            print("Loading PyTorch model (.pth format)...")
+            # Load checkpoint
+            checkpoint = torch.load(model_path, map_location=self.device, weights_only=False)
+            # Get class mapping
+            self.class_to_id = checkpoint['class_to_id']
+            state_dict = checkpoint['model_state_dict']
         self.id_to_class = {v: k for k, v in self.class_to_id.items()}
         self.class_names = [self.id_to_class[i] for i in range(len(self.id_to_class))]
+        # Load model architecture (custom fine-tuned ResNet18)
+        self.model = self._create_model_architecture(len(self.class_names))
+        # Load weights
+        self.model.load_state_dict(state_dict)
         self.model.to(self.device)
         self.model.eval()
+        format_type = "safetensors (secure)" if is_safetensors else "PyTorch (.pth)"
+        print(f"✅ Model loaded successfully ({format_type})")
+        print(f"   Classes: {len(self.class_names)}")
+    def _create_model_architecture(self, num_classes: int):
+        """Create the model architecture matching the trained model"""
+        import torch.nn as nn
+        from torchvision import models
+        class LightweightFineTuned(nn.Module):
+            def __init__(self, num_classes=4):
+                super(LightweightFineTuned, self).__init__()
+                resnet = models.resnet18(weights=None)
+                # Adapt first layer for grayscale spectrograms
+                self.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
+                self.bn1 = resnet.bn1
+                self.relu = resnet.relu
+                self.maxpool = resnet.maxpool
+                self.layer1 = resnet.layer1
+                self.layer2 = resnet.layer2
+                self.layer3 = resnet.layer3
+                self.layer4 = resnet.layer4
+                self.avgpool = resnet.avgpool
+                self.classifier = nn.Sequential(
+                    nn.Dropout(0.5),
+                    nn.Linear(512, 256),
+                    nn.ReLU(),
+                    nn.Dropout(0.25),
+                    nn.Linear(256, num_classes)
+                )
+                self.confidence_head = nn.Sequential(
+                    nn.Linear(512, 1),
+                    nn.Sigmoid()
+                )
+            def forward(self, x, return_confidence=False):
+                if len(x.shape) == 3:
+                    x = x.unsqueeze(1)
+                x = self.conv1(x)
+                x = self.bn1(x)
+                x = self.relu(x)
+                x = self.maxpool(x)
+                x = self.layer1(x)
+                x = self.layer2(x)
+                x = self.layer3(x)
+                x = self.layer4(x)
+                x = self.avgpool(x)
+                features = torch.flatten(x, 1)
+                logits = self.classifier(features)
+                if return_confidence:
+                    confidence = self.confidence_head(features)
+                    return logits, confidence
+                return logits
+        return LightweightFineTuned(num_classes=num_classes)
     def process_audio(self, audio_path: str, sr: int = 16000, duration: float = 10.0) -> np.ndarray:
         """

requirements.txt CHANGED

@@ -5,3 +5,4 @@ librosa>=0.10.0
 numpy>=1.24.0
 scipy>=1.10.0
 soundfile>=0.12.0
+safetensors>=0.4.0