Memories-S0 is designed to address two key challenges in security video understanding:

* **Extreme Efficiency:** It uses an innovative input token compression algorithm that dynamically prunes redundant background tokens, focusing computation on foreground objects and motion. This allows the 3B model to run efficiently on mobile/edge hardware (a toy sketch of the pruning idea follows this list).
* **Post-Training:** The model employs a post-training strategy that combines Reinforcement Learning (RL) with event-based temporal shuffling to enhance sequential understanding without expensive full fine-tuning (see the second sketch below).
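
The compression algorithm itself is not published in this repository. As a purely illustrative sketch, the snippet below prunes patch tokens by inter-frame motion energy; the function name, patch size, and threshold are assumptions made for the example, not the model's actual implementation.

```python
import torch

def prune_static_tokens(frames: torch.Tensor, patch: int = 14, keep_thresh: float = 0.02):
    """Illustrative only: drop patch tokens whose inter-frame motion is negligible.

    frames: (T, C, H, W) video tensor with values in [0, 1]. Returns a boolean
    keep-mask of shape (T-1, num_patches) over the patch tokens of each frame pair.
    """
    # Per-pixel motion energy between consecutive frames
    motion = (frames[1:] - frames[:-1]).abs().mean(dim=1)  # (T-1, H, W)
    # Crop to a whole number of patches, then average motion within each patch
    t, h, w = motion.shape
    motion = motion[:, : h - h % patch, : w - w % patch]
    patches = motion.unfold(1, patch, patch).unfold(2, patch, patch)  # (T-1, h/p, w/p, p, p)
    patch_energy = patches.mean(dim=(-1, -2)).flatten(1)              # (T-1, N)
    # Keep only tokens with noticeable foreground motion
    return patch_energy > keep_thresh
```

A mask like this could then gate which visual tokens are forwarded to the language backbone, which is the intuition behind the efficiency claim above.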
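Similarly, event-based temporal shuffling can be pictured as a data-preparation step: chronological event clips are shuffled, and the permutation is kept as a supervision signal (for example, as the target of an ordering reward in RL). This is an illustrative assumption about the recipe, not the released training code.

```python
import random

def make_shuffled_ordering_example(event_clips):
    """Illustrative only: turn chronological event clips into an ordering task.

    event_clips: list of clips in chronological order. Returns the shuffled
    clips together with `order`, where order[i] is the original chronological
    index of shuffled clip i (the label a model would learn to recover).
    """
    order = list(range(len(event_clips)))
    random.shuffle(order)
    shuffled = [event_clips[i] for i in order]
    return shuffled, order
```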
## Installation
```bash
conda create -n memories-s0 python=3.10 -y
conda activate memories-s0
# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install dependencies for Qwen2.5-VL architecture and Flash Attention
pip install "transformers>=4.37.0" accelerate qwen_vl_utils
pip install flash-attn --no-build-isolation
```
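
Optionally, verify that the GPU stack is visible before moving on. These checks are generic PyTorch/Flash Attention checks, not specific to Memories-S0:

```bash
# Optional sanity checks: CUDA visibility and the Flash Attention import
python -c "import torch; print(torch.cuda.is_available())"
python -c "import flash_attn; print(flash_attn.__version__)"
```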
## Inference
The following script demonstrates how to run the **Memories-S0** model; the weights are downloaded automatically from the official Hugging Face repository on first use.
```python
import torch
import argparse
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# Official model repository
MODEL_ID = "Memories-ai/security_model"

def run_inference(video_path, model_id=MODEL_ID):
    # Load the model with Flash Attention 2 for efficiency
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        attn_implementation="flash_attention_2",
        device_map="auto",
    )

    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # Define the security analysis prompt
    prompt_text = """YOUR_PROMPT"""

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "video", "video": video_path},
                {"type": "text", "text": prompt_text},
            ],
        }
    ]

    # Preprocessing
    text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    image_inputs, video_inputs, video_kwargs = process_vision_info(messages, return_video_kwargs=True)

    inputs = processor(
        text=[text],
        images=image_inputs,
        videos=video_inputs,
        padding=True,
        return_tensors="pt",
        **video_kwargs,
    )
    inputs = inputs.to("cuda")

    # Generate, then strip the prompt tokens from each output sequence
    generated_ids = model.generate(**inputs, max_new_tokens=768)
    generated_ids_trimmed = [
        out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
    ]
    output_text = processor.batch_decode(
        generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
    )

    print(output_text[0])

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--video_path", type=str, required=True, help="Path to input video")
    args = parser.parse_args()
    run_inference(args.video_path)
```
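
Assuming the script is saved as `run_inference.py` (any filename works), run it against a local clip:

```bash
python run_inference.py --video_path /path/to/video.mp4
```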
## Intended Use
### Primary Use Cases