Upload 2 files
Browse files- v2/README.md +95 -0
- v2/infer.py +84 -0
v2/README.md
ADDED
|
@@ -0,0 +1,95 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Minimal Inference Setup
|
| 2 |
+
|
| 3 |
+
This project provides a lightweight setup for running inference with a pre-trained model.
|
| 4 |
+
It contains the model configuration, trained weights, and a Python script to perform inference.
|
| 5 |
+
|
| 6 |
+
---
|
| 7 |
+
|
| 8 |
+
## Project Structure
|
| 9 |
+
|
| 10 |
+
```
.
├── model/
│   ├── config.json          # Model configuration file
│   └── model.safetensors    # Pre-trained model weights
└── infer.py                 # Script to run inference on input data
```
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## Prerequisites
|
| 21 |
+
|
| 22 |
+
- Python 3.8+
|
| 23 |
+
- PyTorch
|
| 24 |
+
- Transformers library
|
| 25 |
+
- safetensors
|
| 26 |
+
- PIL (Pillow)
|
| 27 |
+
- tkinter (required — `infer.py` provides a GUI file picker; bundled with most Python installations)
|
| 28 |
+
|
| 29 |
+
Install required packages:
|
| 30 |
+
|
| 31 |
+
```bash
|
| 32 |
+
pip install torch transformers safetensors pillow
|
| 33 |
+
```
|
| 34 |
+
|
| 35 |
+
---
|
| 36 |
+
|
| 37 |
+
## Files Description
|
| 38 |
+
|
| 39 |
+
### model/config.json
|
| 40 |
+
Defines the architecture and hyperparameters of the model (e.g., hidden size, number of layers, vocabulary size).
|
| 41 |
+
|
| 42 |
+
Required to correctly instantiate the model before loading the weights.
|
| 43 |
+
|
| 44 |
+
### model/model.safetensors
|
| 45 |
+
Contains the trained weights of the model.
|
| 46 |
+
|
| 47 |
+
Stored in the Safetensors format for safety and efficiency.
|
| 48 |
+
|
| 49 |
+
### infer.py
|
| 50 |
+
Main script to perform inference with the pre-trained model.
|
| 51 |
+
|
| 52 |
+
**Responsibilities:**
|
| 53 |
+
- Loads config.json and model.safetensors
|
| 54 |
+
- Preprocesses the input image (resize and normalization)
|
| 55 |
+
- Runs the model forward pass
|
| 56 |
+
- Outputs predictions
|
| 57 |
+
|
| 58 |
+
**Usage:**
```bash
python infer.py
```

A window opens; click **Load Image** and choose a `.png`, `.jpg`, or `.jpeg` file.
The predicted grade and its confidence score are shown below the image.
|
| 67 |
+
|
| 68 |
+
---
|
| 69 |
+
|
| 70 |
+
## Usage Workflow
|
| 71 |
+
|
| 72 |
+
1. Place the model files (`config.json` and `model.safetensors`) inside the `model/` directory.
2. Run `python infer.py` and select an image via the **Load Image** button.
3. The GUI displays the predicted grade and its confidence score.
|
| 75 |
+
|
| 76 |
+
---
|
| 77 |
+
|
| 78 |
+
## Notes
|
| 79 |
+
|
| 80 |
+
- Ensure the model files are compatible (same checkpoint version).
|
| 81 |
+
- For image-based models, inputs must be resized to the expected dimensions (e.g., 224x224 RGB).
|
| 82 |
+
- For text-based models, ensure the tokenizer is compatible with the config (may require adding tokenizer files).
|
| 83 |
+
- GPU is recommended for faster inference, but CPU is supported.
|
| 84 |
+
|
| 85 |
+
---
|
| 86 |
+
|
| 87 |
+
## License
|
| 88 |
+
|
| 89 |
+
[Add license information here if applicable]
|
| 90 |
+
|
| 91 |
+
---
|
| 92 |
+
|
| 93 |
+
## Contributing
|
| 94 |
+
|
| 95 |
+
[Add contribution guidelines here if applicable]
|
v2/infer.py
ADDED
|
@@ -0,0 +1,84 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
import tkinter as tk
|
| 2 |
+
from tkinter import filedialog, messagebox
|
| 3 |
+
from PIL import Image, ImageTk
|
| 4 |
+
import torch
|
| 5 |
+
from transformers import AutoImageProcessor, SiglipForImageClassification
|
| 6 |
+
|
| 7 |
+
# Model and processor paths (adjust if needed; assumes final model saved in output_dir)
model_path = "./model"  # Path to the fine-tuned model

# Load processor and model.
# NOTE(review): the image processor is loaded from "./base model" (a path
# containing a space) while the weights come from model_path — presumably the
# processor config was only saved alongside the base checkpoint; confirm this
# path exists and matches the fine-tuned model's preprocessing.
processor = AutoImageProcessor.from_pretrained("./base model")
model = SiglipForImageClassification.from_pretrained(model_path)
# Prefer GPU when available; fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()  # inference mode: disables dropout / training-only behavior

# Get label mappings (class index -> human-readable label) from model config
id2label = model.config.id2label
|
| 19 |
+
|
| 20 |
+
# Tkinter GUI
|
| 21 |
+
class ImageClassifierApp:
|
| 22 |
+
def __init__(self, root):
|
| 23 |
+
self.root = root
|
| 24 |
+
self.root.title("SigLIP2 Gardner Grading Classifier")
|
| 25 |
+
self.root.geometry("600x600")
|
| 26 |
+
|
| 27 |
+
# Label for instructions
|
| 28 |
+
self.instruction_label = tk.Label(root, text="Select an image to classify")
|
| 29 |
+
self.instruction_label.pack(pady=10)
|
| 30 |
+
|
| 31 |
+
# Button to load image
|
| 32 |
+
self.load_button = tk.Button(root, text="Load Image", command=self.load_image)
|
| 33 |
+
self.load_button.pack(pady=10)
|
| 34 |
+
|
| 35 |
+
# Canvas to display image
|
| 36 |
+
self.image_canvas = tk.Canvas(root, width=400, height=400, bg="white")
|
| 37 |
+
self.image_canvas.pack(pady=10)
|
| 38 |
+
|
| 39 |
+
# Label to display prediction
|
| 40 |
+
self.prediction_label = tk.Label(root, text="", font=("Arial", 14))
|
| 41 |
+
self.prediction_label.pack(pady=10)
|
| 42 |
+
|
| 43 |
+
def load_image(self):
|
| 44 |
+
file_path = filedialog.askopenfilename(filetypes=[("Image files", "*.png *.jpg *.jpeg")])
|
| 45 |
+
if file_path:
|
| 46 |
+
try:
|
| 47 |
+
# Open and convert image to RGB
|
| 48 |
+
img = Image.open(file_path).convert("RGB")
|
| 49 |
+
img_resized = img.resize((400, 400)) # For display
|
| 50 |
+
self.photo_img = ImageTk.PhotoImage(img_resized)
|
| 51 |
+
self.image_canvas.create_image(200, 200, image=self.photo_img)
|
| 52 |
+
|
| 53 |
+
# Preprocess with explicit settings
|
| 54 |
+
inputs = processor(
|
| 55 |
+
images=img,
|
| 56 |
+
return_tensors="pt",
|
| 57 |
+
do_resize=True,
|
| 58 |
+
size={"height": 224, "width": 224}, # Adjust based on model's expected size (common for SigLIP)
|
| 59 |
+
do_normalize=True
|
| 60 |
+
).to(device)
|
| 61 |
+
|
| 62 |
+
# Inference
|
| 63 |
+
with torch.no_grad():
|
| 64 |
+
outputs = model(**inputs)
|
| 65 |
+
logits = outputs.logits
|
| 66 |
+
probabilities = torch.softmax(logits, dim=-1)
|
| 67 |
+
max_prob, predicted_id = probabilities.max(dim=-1)
|
| 68 |
+
predicted_label = id2label[predicted_id.item()]
|
| 69 |
+
|
| 70 |
+
# Heuristic: If confidence is low, classify as "Not an Embryo"
|
| 71 |
+
confidence_threshold = 0.45# Adjust this threshold as needed
|
| 72 |
+
if max_prob.item() < confidence_threshold:
|
| 73 |
+
predicted_label = "Not an Embryo"
|
| 74 |
+
|
| 75 |
+
# Display prediction
|
| 76 |
+
self.prediction_label.config(text=f"Predicted Grade: {predicted_label} (Confidence: {max_prob.item():.2f})")
|
| 77 |
+
|
| 78 |
+
except Exception as e:
|
| 79 |
+
messagebox.showerror("Error", f"Failed to process image: {str(e)}")
|
| 80 |
+
|
| 81 |
+
if __name__ == "__main__":
    # Script entry point: build the GUI and enter the Tkinter event loop.
    # mainloop() blocks until the window is closed.
    root = tk.Tk()
    app = ImageClassifierApp(root)
    root.mainloop()
|