chukewang committed on
Commit 4511f87 · 1 Parent(s): 3901199

Init: add images via LFS

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -15,7 +15,7 @@ metrics:
  - accuracy
  ---
 
- ## 🚀🚀TimeAudio: Bridging Temporal Gaps in Large Audio-Language
+ ## 🚀🚀 TimeAudio: Bridging Temporal Gaps in Large Audio-Language
 
  <div style='display:flex; gap: 0.25rem; '>
  <a href='https://arxiv.org/pdf/.pdf'><img src='https://img.shields.io/badge/paper-PDF-green'></a>
@@ -47,8 +47,8 @@ You need to use the following dependencies:
  2. Download [whisper large v2](https://huggingface.co/openai/whisper-large-v2/tree/main) to ```whisper_path```.
  3. Download [Fine-tuned BEATs_iter3+ (AS2M) (cpt2)](https://valle.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M_finetuned_on_AS2M_cpt2.pt?sv=2020-08-04&st=2023-03-01T07%3A51%3A05Z&se=2033-03-02T07%3A51%3A00Z&sr=c&sp=rl&sig=QJXmSJG9DbMKf48UDIU1MfzIro8HQOf3sqlNXiflY1I%3D) to `beats_path`.
  4. Download [vicuna 7B v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5/tree/main) to ```vicuna_path```.
- 5. Download [salmonn-7b v0](https://huggingface.co/tsinghua-ee/SALMONN-7B/blob/main/salmonn_7b_v0.pth) to ```ckpt_path```.
- 6. Running with ```python3 cli_inference.py --ckpt_path xxx --whisper_path xxx --beats_path xxx --vicuna_path xxx``` to start cli inference. Please make sure your GPU has more than 40G of memory. If your GPU does not have enough memory (e.g. only 24G), you can quantize the model using the `--low_resource` parameter to reduce the memory usage, and can reduce the LoRA scaling factor to maintain the model's emergent abilities, e.g. `--lora_alpha=28`.
+ 5. Download [timeaudio](https://huggingface.co/lysanderism/TimeAudio/timeaudio.pth) to ```ckpt_path```.
+ 6. Running with ```python3 cli_inference.py --ckpt_path xxx --whisper_path xxx --beats_path xxx --vicuna_path xxx``` to start cli inference. Please make sure your GPU has more than 40G of memory. If your GPU does not have enough memory (e.g. only 24G), you can quantize the model using the `--low_resource` parameter to reduce the memory usage.
 
  ## Launch a QA
 
@@ -57,7 +57,7 @@ You need to use the following dependencies:
 
 
  ## Citation
- If you find SALMONN great and useful, please cite our paper:
+ If you find TimeAudio great and useful, please cite our paper:
  ```
  @article{,
  title={TimeAudio: Bridging Temporal Gaps in Large Audio-Language Models},