prasadsachin committed · verified
Commit 6fc21d5 · Parent(s): 7b932f9

Update README.md with new model card content

Files changed (1): README.md (+86 -13)
README.md CHANGED
@@ -1,16 +1,89 @@
  ---
  library_name: keras-hub
  ---
- This is a [`SAM3PromptableConcept` model](https://keras.io/api/keras_hub/models/sam3_promptable_concept) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
- Model config:
- * **name:** sam3_promptable_concept_backbone
- * **trainable:** True
- * **dtype:** {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}
- * **vision_encoder:** {'module': 'keras_hub.src.models.sam3.sam3_vision_encoder', 'class_name': 'SAM3VisionEncoder', 'config': {'name': 'sam3_vision_encoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'image_shape': [1008, 1008, 3], 'patch_size': 14, 'num_layers': 32, 'hidden_dim': 1024, 'intermediate_dim': 4736, 'num_heads': 16, 'fpn_hidden_dim': 256, 'fpn_scale_factors': [4.0, 2.0, 1.0, 0.5], 'pretrain_image_shape': [336, 336, 3], 'hidden_activation': 'gelu', 'rope_theta': 10000.0, 'window_size': 24, 'global_attn_indexes': [7, 15, 23, 31], 'attention_dropout_rate': 0.0, 'hidden_dropout_rate': 0.0, 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>SAM3VisionEncoder'}
- * **text_encoder:** {'module': 'keras_hub.src.models.sam3.sam3_text_encoder', 'class_name': 'SAM3TextEncoder', 'config': {'name': 'sam3_text_encoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'vocabulary_size': 49408, 'embedding_dim': 1024, 'hidden_dim': 1024, 'num_layers': 24, 'num_heads': 16, 'intermediate_dim': 4096, 'intermediate_activation': 'gelu', 'max_sequence_length': 32, 'layer_norm_epsilon': 1e-05}, 'registered_name': 'keras_hub>SAM3TextEncoder'}
- * **geometry_encoder:** {'module': 'keras_hub.src.models.sam3.sam3_geometry_encoder', 'class_name': 'SAM3GeometryEncoder', 'config': {'name': 'sam3_geometry_encoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'num_layers': 3, 'hidden_dim': 256, 'intermediate_dim': 2048, 'num_heads': 8, 'roi_size': 7, 'hidden_activation': 'relu', 'dropout_rate': 0.0, 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>SAM3GeometryEncoder'}
- * **detr_encoder:** {'module': 'keras_hub.src.models.sam3.sam3_detr_encoder', 'class_name': 'SAM3DetrEncoder', 'config': {'name': 'sam3_detr_encoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'num_layers': 6, 'hidden_dim': 256, 'intermediate_dim': 2048, 'num_heads': 8, 'hidden_activation': 'relu', 'dropout_rate': 0.1, 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>SAM3DetrEncoder'}
- * **detr_decoder:** {'module': 'keras_hub.src.models.sam3.sam3_detr_decoder', 'class_name': 'SAM3DetrDecoder', 'config': {'name': 'sam3_detr_decoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'image_shape': [1008, 1008, 3], 'patch_size': 14, 'num_layers': 6, 'hidden_dim': 256, 'intermediate_dim': 2048, 'num_heads': 8, 'num_queries': 200, 'hidden_activation': 'relu', 'dropout_rate': 0.1, 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>SAM3DetrDecoder'}
- * **mask_decoder:** {'module': 'keras_hub.src.models.sam3.sam3_mask_decoder', 'class_name': 'SAM3MaskDecoder', 'config': {'name': 'sam3_mask_decoder', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'num_upsampling_stages': 3, 'hidden_dim': 256, 'num_heads': 8, 'dropout_rate': 0.0, 'layer_norm_epsilon': 1e-06}, 'registered_name': 'keras_hub>SAM3MaskDecoder'}
-
- This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.
+ # SAM 3
+
+ ### Model Overview
+
+ The Segment Anything Model 3 (SAM 3) is a high-performance foundation model for promptable object segmentation in images. Building upon the breakthroughs of previous SAM iterations, SAM 3 is designed for real-time performance, superior mask quality, and improved zero-shot generalization across diverse visual domains.
+
+ ## Model Summary
+
+ SAM 3 follows the "Segment Anything" philosophy by providing a universal interface for segmentation via prompts such as points, bounding boxes, or previous masks. It features a decoupled architecture that separates heavy image encoding from lightweight prompt processing, allowing the model to generate masks in near real-time once an image embedding has been computed.
+
+ SAM 3 promptable concept segmentation (PCS) segments objects in images based on concept prompts, which can be short noun phrases (e.g., “yellow school bus”), image exemplars, or a combination of both. SAM 3 PCS takes such prompts and returns segmentation masks and unique identities for all matching object instances.
+
+ There are two ways to prompt:
+ 1. Text prompt: A short noun phrase describing the concept to segment.
+ 2. Box prompt: A box tells the model which part/crop of the image to segment.
+
+ These prompts can be used individually or together, but at least one prompt must be present. To turn off a particular prompt, simply exclude it from the inputs to the model.
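To make the "exclude it from the inputs" rule concrete, here is a minimal sketch of text-only versus box-only input dictionaries. The key names follow the Example Usage section below; the shapes, dtypes, and prompt strings are illustrative assumptions:

```python
import numpy as np

batch_size = 2
image_size = 128
images = np.ones((batch_size, image_size, image_size, 3), dtype="float32")

# Text-only prompting: the box entries are simply left out.
text_only_inputs = {
    "images": images,
    "prompts": ["yellow school bus", "ear"],
}

# Box-only prompting: the "prompts" entry is left out instead.
box_only_inputs = {
    "images": images,
    "boxes": np.ones((batch_size, 1, 4), dtype="float32"),  # XYXY format.
    "box_labels": np.ones((batch_size, 1), dtype="float32"),
}

print(sorted(text_only_inputs))  # ['images', 'prompts']
print(sorted(box_only_inputs))   # ['box_labels', 'boxes', 'images']
```

Either dictionary can be passed to the model; passing a dictionary with neither `prompts` nor `boxes` would violate the at-least-one-prompt requirement.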
+
+ This modular design allows users to swap backbones of varying sizes (Tiny, Small, Base, Large) depending on hardware constraints and accuracy requirements.
+
+ ## References
+
+ * SAM 3 Quickstart Notebook (coming soon)
+ * [SAM 3 API Documentation](https://keras.io/keras_hub/api/models/sam3/)
+ * [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
+ * [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
+ * [Segment Anything 3 Technical Report](https://huggingface.co/facebook/sam3)
+
+ ## Installation
+
+ Keras and KerasHub can be installed using the following commands:
+
+ ```bash
+ pip install -U -q keras-hub
+ pip install -U -q keras
+ ```
+
+ The following table summarizes the configurations available for SAM 3 in KerasHub:
+
+ | Preset     | Parameters | Description                                     |
+ | ---------- | ---------- | ----------------------------------------------- |
+ | `sam3_pcs` | ~30M       | Promptable Concept Segmentation (PCS) SAM model |
+
+ ## Example Usage
+
+ ```python
+ import numpy as np
+ import keras_hub
+
+ image_size = 128
+ batch_size = 2
+ input_data = {
+     "images": np.ones(
+         (batch_size, image_size, image_size, 3), dtype="float32",
+     ),
+     "prompts": ["ear", "head"],
+     "boxes": np.ones((batch_size, 1, 4), dtype="float32"),  # XYXY format.
+     "box_labels": np.ones((batch_size, 1), dtype="float32"),
+ }
+ sam3_pcs = keras_hub.models.SAM3PromptableConceptImageSegmenter.from_preset(
+     "sam3_pcs"
+ )
+ outputs = sam3_pcs.predict(input_data)
+ scores = outputs["scores"]  # [B, num_queries]
+ boxes = outputs["boxes"]    # [B, num_queries, 4]
+ masks = outputs["masks"]    # [B, num_queries, H, W]
+ ```
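The per-query outputs usually need filtering before use. Below is a hypothetical post-processing sketch run on synthetic arrays shaped like the outputs above; the 0.5 score and mask thresholds are assumptions to tune for your application, not values from this model card:

```python
import numpy as np

# Synthetic stand-ins shaped like the predict() outputs above.
batch_size, num_queries, h, w = 2, 200, 252, 252
rng = np.random.default_rng(0)
scores = rng.random((batch_size, num_queries)).astype("float32")
boxes = rng.random((batch_size, num_queries, 4)).astype("float32")
masks = rng.random((batch_size, num_queries, h, w)).astype("float32")

score_threshold = 0.5  # assumed value; tune per application
keep = scores[0] > score_threshold   # boolean mask over the queries of image 0
kept_boxes = boxes[0][keep]          # [num_kept, 4]
kept_masks = masks[0][keep] > 0.5    # binarize (assuming mask probabilities)

print(kept_boxes.shape[1])  # 4
```

The same indexing pattern extends over the batch dimension by looping over images, since each image generally keeps a different number of instances.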
+
+ ## Example Usage with Hugging Face URI
+
+ ```python
+ import numpy as np
+ import keras_hub
+
+ image_size = 128
+ batch_size = 2
+ input_data = {
+     "images": np.ones(
+         (batch_size, image_size, image_size, 3), dtype="float32",
+     ),
+     "prompts": ["ear", "head"],
+     "boxes": np.ones((batch_size, 1, 4), dtype="float32"),  # XYXY format.
+     "box_labels": np.ones((batch_size, 1), dtype="float32"),
+ }
+ sam3_pcs = keras_hub.models.SAM3PromptableConceptImageSegmenter.from_preset(
+     "hf://keras/sam3_pcs"
+ )
+ outputs = sam3_pcs.predict(input_data)
+ scores = outputs["scores"]  # [B, num_queries]
+ boxes = outputs["boxes"]    # [B, num_queries, 4]
+ masks = outputs["masks"]    # [B, num_queries, H, W]
+ ```