CocoBro/Foley-Omni

CocoBro committed 6 days ago

Commit 1ebd877 (verified)

Parent: f0b101f

Add files using upload-large-folder tool

Files changed (2)

README.md CHANGED

@@ -24,7 +24,7 @@ The main model checkpoint in this release is an inference-only export from:
 ```text
 ckpts/
 ├── Foley-Omni/
-│   └── model_checkpoint.pth
+│   └── v2st.pth
 ├── Wan2.2-TI2V-5B/
 │   ├── models_t5_umt5-xxl-enc-bf16.pth
 │   └── google/
@@ -42,7 +42,7 @@ ckpts/
 What each part is used for:
-- `ckpts/Foley-Omni/model_checkpoint.pth`: released inference-only Foley-Omni weights
+- `ckpts/Foley-Omni/v2st.pth`: released inference-only Foley-Omni weights
 - `ckpts/Wan2.2-TI2V-5B/*`: text encoder and tokenizer for text conditioning
 - `ckpts/mmaudio/ext_weights/v1-16.pth`: audio VAE for the 16 kHz inference path
 - `ckpts/mmaudio/ext_weights/best_netG.pt`: vocoder for waveform decoding
@@ -68,4 +68,28 @@ This repository redistributes a small subset of files from the following upstrea
 - **MMAudio**: audio VAE, vocoder, and Synchformer files
 Please refer to the original upstream repositories for their licenses, usage terms, and project details.
+## Quick Start
+Use the code repository for inference scripts, configs, examples, and feature extraction tools:
+- `inference_v2st.py`
+- `inference_v2st.yaml`
+- `examples/video_text_example.json`
+- `data_process/convert_memmap_to_npy.py`
+Download the packaged checkpoints with:
+```bash
+hf download CocoBro/Foley-Omni \
+  ckpts/Foley-Omni/v2st.pth \
+  ckpts/Wan2.2-TI2V-5B/models_t5_umt5-xxl-enc-bf16.pth \
+  ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/special_tokens_map.json \
+  ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/spiece.model \
+  ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/tokenizer.json \
+  ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/tokenizer_config.json \
+  ckpts/mmaudio/ext_weights/v1-16.pth \
+  ckpts/mmaudio/ext_weights/best_netG.pt \
+  ckpts/mmaudio/ext_weights/synchformer_state_dict.pth \
+  --local-dir .
 ```
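
Once the download finishes, it can help to confirm that every path named in the `hf download` command exists under the local directory before running `inference_v2st.py`. The sketch below does only that. The file list is copied from the command in the README diff; the names `EXPECTED_FILES` and `missing_checkpoints` are illustrative and not part of the repository.

```python
from pathlib import Path

# Paths copied from the `hf download` command, relative to --local-dir.
EXPECTED_FILES = [
    "ckpts/Foley-Omni/v2st.pth",
    "ckpts/Wan2.2-TI2V-5B/models_t5_umt5-xxl-enc-bf16.pth",
    "ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/special_tokens_map.json",
    "ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/spiece.model",
    "ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/tokenizer.json",
    "ckpts/Wan2.2-TI2V-5B/google/umt5-xxl/tokenizer_config.json",
    "ckpts/mmaudio/ext_weights/v1-16.pth",
    "ckpts/mmaudio/ext_weights/best_netG.pt",
    "ckpts/mmaudio/ext_weights/synchformer_state_dict.pth",
]


def missing_checkpoints(root=".", expected=EXPECTED_FILES):
    """Return the expected paths that are not regular files under root."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoint files are present.")
```

An empty result only shows that the files exist; it says nothing about whether their contents are intact.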

ckpts/Foley-Omni/v2st.pth ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:8dfcebc33b4848b3639cea815000c8b2c9e02de2ffc655a763c03f3e4232d941
+size 22214978751
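
The Git LFS pointer above records the SHA-256 digest and byte size (22,214,978,751 bytes, roughly 22.2 GB) of the actual checkpoint. Below is a minimal sketch, assuming the file has already been downloaded to `ckpts/Foley-Omni/v2st.pth`, that parses a pointer in this three-line format and checks a local file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and do not come from the repository.

```python
import hashlib
import os

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8dfcebc33b4848b3639cea815000c8b2c9e02de2ffc655a763c03f3e4232d941
size 22214978751
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and hash both match the pointer."""
    # Cheap check first: a size mismatch avoids hashing the whole file.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so the checkpoint never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

A call such as `matches_pointer("ckpts/Foley-Omni/v2st.pth", parse_lfs_pointer(POINTER_TEXT))` would then report whether the local copy matches. Hashing a file of this size can take a while.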