nvidia/MambaVision-T-1K


ahatamiz committed on Jul 23, 2024

Commit 60ce7c6 (verified) · 1 Parent(s): ecc951d

Update README.md

Files changed (1)

README.md +57 -1

@@ -34,13 +34,69 @@ You must first login into HuggingFace to pull the model:
 huggingface-cli login
 ```
-The model can be simply used according to:
 ```Python
 from transformers import AutoModelForImageClassification
 model = AutoModelForImageClassification.from_pretrained("nvidia/MambaVision-T-1K", trust_remote_code=True)
 ```
 ### License:

 huggingface-cli login
 ```
+It is highly recommended to install the requirements for MambaVision by running the following:
+```Bash
+pip install mambavision
+```
+For each model, we offer two variants, one for image classification and one for feature extraction, each of which can be imported with a single line of code.
+The image classification model can be imported as follows:
 ```Python
 from transformers import AutoModelForImageClassification
 model = AutoModelForImageClassification.from_pretrained("nvidia/MambaVision-T-1K", trust_remote_code=True)
 ```
+The model outputs logits when an image is passed. If a label is additionally provided, the cross-entropy loss between the predicted logits and the label is also computed.
+The following demonstrates a minimal example of how to use the model:
+```Python
+from transformers import AutoModelForImageClassification
+from PIL import Image
+import requests
+import timm
+import torch
+# import mambavision model
+model = AutoModelForImageClassification.from_pretrained("nvidia/MambaVision-T-1K", trust_remote_code=True)
+# eval mode for inference
+model.eval()
+# prepare image for the model
+url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
+image = Image.open(requests.get(url, stream=True).raw)
+# define a transform
+transforms = timm.data.create_transform((3, 224, 224))
+image = transforms(image).unsqueeze(0)
+# put both model and image on cuda
+model = model.cuda()
+image = image.cuda()
+# forward pass
+outputs = model(image)
+# You can then extract the predicted probabilities by applying softmax:
+probabilities = torch.nn.functional.softmax(outputs['logits'], dim=-1)
+# In order to find the top 5 predicted class indexes and their corresponding values:
+values, indices = torch.topk(probabilities, 5)
+```
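The post-processing steps above can be checked without downloading the model. The following is a minimal sketch that uses a random tensor as a stand-in for `outputs['logits']`, assuming the logits have shape `(batch_size, 1000)` for ImageNet-1K. The `labels` tensor is hypothetical, and the `cross_entropy` call only illustrates the kind of loss described above; it is not taken from the model's own implementation.

```python
import torch
import torch.nn.functional as F

# Stand-in for outputs['logits']: 2 images, 1000 classes (assumed shape).
torch.manual_seed(0)
logits = torch.randn(2, 1000)

# Apply softmax over the class dimension (the last one),
# so that each row becomes a probability distribution.
probabilities = F.softmax(logits, dim=-1)

# Top-5 probabilities and class indices for each image in the batch.
values, indices = torch.topk(probabilities, k=5, dim=-1)

# Cross-entropy between the logits and hypothetical integer class labels.
labels = torch.tensor([3, 7])
loss = F.cross_entropy(logits, labels)
```

Taking the softmax over the last dimension matters here: with a batch of one image, normalizing over dimension 0 would turn every entry into 1.0 instead of producing class probabilities.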
 ### License: