findcard12138 committed
Commit 32d2053 · verified · 1 parent: ad70ed1

Upload README.md with huggingface_hub

Files changed (1): README.md (+9 −15)
README.md CHANGED
@@ -44,19 +44,12 @@ For architecture diagrams and full system details, see the top-level repository:
 
 ## 🚀 Quickstart
 
-
-### Offline video inference (works with base/SFT checkpoints)
-
-Use this to sanity-check **loading**, **video ingestion**, and **end-to-end generation**.
-
-#### Video inference (Python, recommended)
+<details>
+<summary><strong>Video inference</strong></summary>
 
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoProcessor
-
-# Use local path like: "models/moss-video-preview-base"
-# Or use Hugging Face model id like: "fnlp-vision/moss-video-preview-base"
 checkpoint = "fnlp-vision/moss-video-preview-base"
 video_path = "data/example_video.mp4"
 prompt = "" # For base model, prompt is set to empty to perform completion task.
@@ -99,20 +92,20 @@ with torch.no_grad():
 output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)
 
 print(processor.decode(output_ids[0], skip_special_tokens=True))
-# Tip: set skip_special_tokens=False only when debugging special tokens / chat template formatting.
+
 ```
 
 
 
-#### Image inference (Python)
+</details>
+
+<details>
+<summary><strong>Image inference</strong></summary>
 
 ```python
 import torch
 from PIL import Image
 from transformers import AutoModelForCausalLM, AutoProcessor
-
-# Use local path like: "models/moss-video-preview-base"
-# Or use Hugging Face model id like: "fnlp-vision/moss-video-preview-base"
 checkpoint = "fnlp-vision/moss-video-preview-base"
 image_path = "data/example_image.jpg"
 prompt = "" # For base model, prompt is set to empty to perform completion task.
@@ -153,9 +146,10 @@ with torch.no_grad():
 output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
 
 print(processor.decode(output_ids[0], skip_special_tokens=True))
-# Tip: set skip_special_tokens=False only when debugging special tokens / chat template formatting.
 ```
 
+</details>
+
 ## ✅ Intended use
 
 - **Foundation checkpoint**: continue pretraining, run domain adaptation, or perform supervised fine-tuning (offline SFT / realtime SFT).
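
The commit replaces the plain `###`/`####` example headings with collapsible HTML disclosure blocks. As a sketch of the pattern (example bodies abridged to placeholders — the real README keeps the full Python snippets), the post-change Quickstart is structured like this; the blank line after `</summary>` is kept because many Markdown renderers require it before Markdown content inside an HTML block:

````markdown
## 🚀 Quickstart

<details>
<summary><strong>Video inference</strong></summary>

```python
# ...video inference example...
```

</details>

<details>
<summary><strong>Image inference</strong></summary>

```python
# ...image inference example...
```

</details>
````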