Download the model used for prompt derivation (captioning) to runtime/models/MiniCPM-V-4_5: https://modelscope.ai/models/OpenBMB/MiniCPM-V-4_5/files
Extract caption_python.7z into the runtime directory. This archive is the Python environment; it contains so many files that it can only be uploaded as a compressed package. If you would rather not download it, or consider it risky, you can install Codex and have it set up a fresh environment for you.
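After the two steps above, a quick check of the folder layout can catch a misplaced download. The following is a minimal sketch, not part of the tool: the function name check_runtime_layout is hypothetical, and it only checks that the runtime directory and the model directory exist and that the model directory is not empty. It does not verify individual model files or the contents of the extracted Python environment, since neither is listed here.

```python
from pathlib import Path


def check_runtime_layout(root="."):
    """Return a list of problems found; an empty list means the layout looks fine.

    Hypothetical helper: only the two paths named in the setup steps are checked.
    """
    root = Path(root)
    runtime = root / "runtime"
    model_dir = runtime / "models" / "MiniCPM-V-4_5"

    problems = []
    if not runtime.is_dir():
        problems.append(f"missing directory: {runtime}")
    if not model_dir.is_dir():
        problems.append(f"missing model directory: {model_dir}")
    elif not any(model_dir.iterdir()):
        # The directory exists but the model files were never downloaded into it.
        problems.append(f"model directory is empty: {model_dir}")
    return problems


if __name__ == "__main__":
    issues = check_runtime_layout(".")
    if issues:
        for issue in issues:
            print(issue)
    else:
        print("runtime layout looks fine")
```

Run it from the folder that contains the runtime directory.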
Detailed tutorial: https://youtu.be/h27Sedb_v08
Features:
Edits videos to the frame rate/resolution needed for training.
Includes cropping functionality to remove unwanted subtitles/black borders.
Offers faster frame range selection.
Records the time range, crop box, and prompt for each data item, so clips can be re-edited later without the manual readjustment and proofreading that traditional editing tools require.
Includes a prompt derivation (automatic captioning) function. Running it locally requires 16GB of VRAM; if VRAM is insufficient, enable low VRAM mode in the settings.
The interface can be switched to English in the settings.
Supports batch conversion of frame rate/resolution for existing datasets.
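
To make the features above more concrete, here is a minimal sketch of what a per-clip record and the corresponding conversion might look like. Everything in it is an assumption for illustration: the ClipRecord fields, the build_ffmpeg_cmd helper, and the use of ffmpeg itself are hypothetical and are not taken from the tool, whose actual storage format and conversion backend are not described here. The sketch only builds the command line; it does not run it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ClipRecord:
    """Hypothetical per-clip record: time range, crop box, and prompt."""
    source: str
    start_s: float
    end_s: float
    crop_x: int
    crop_y: int
    crop_w: int
    crop_h: int
    prompt: str = ""


def build_ffmpeg_cmd(rec, out_path, fps, width, height):
    """Build an ffmpeg argument list that trims, crops, resamples, and resizes.

    Assumes ffmpeg is the backend, which is an illustration only.
    """
    # Filter order: crop away borders/subtitles first, then set fps, then scale.
    vf = ",".join([
        f"crop={rec.crop_w}:{rec.crop_h}:{rec.crop_x}:{rec.crop_y}",
        f"fps={fps}",
        f"scale={width}:{height}",
    ])
    return [
        "ffmpeg", "-y",
        "-ss", str(rec.start_s),   # start of the selected range (seconds)
        "-to", str(rec.end_s),     # end of the selected range (seconds)
        "-i", rec.source,
        "-vf", vf,
        "-an",                     # drop audio; an assumption, not a tool setting
        out_path,
    ]


if __name__ == "__main__":
    rec = ClipRecord("input.mp4", 1.5, 4.0, 0, 60, 1920, 960, "a cat on a sofa")
    # Storing the record as JSON lets the clip be re-edited later.
    print(json.dumps(asdict(rec), indent=2))
    print(" ".join(build_ffmpeg_cmd(rec, "clip_0001.mp4", fps=16, width=832, height=416)))
```

Keeping the time range and crop box in a record like this, rather than only in the exported clip, is what makes later re-editing and batch re-conversion possible without redoing the selection by hand.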
