AILab-CVC
/

SEED

Model card Files Files and versions

xet

Community

tttoaster commited on Oct 19, 2023

Commit

255addb

1 Parent(s): 57f5d82

Update README.md

Browse files

Files changed (1) hide show

README.md +75 -0

README.md CHANGED Viewed

@@ -1,3 +1,78 @@
 ---
 license: llama2
 ---

 ---
 license: llama2
 ---
+# SEED Multimodal
+[Project Homepage](https://ailab-cvc.github.io/seed/)
+**Powered by [CV Center, Tencent AI Lab](https://ailab-cvc.github.io), and [ARC Lab, Tencent PCG](https://github.com/TencentARC).**
+## Usage
+### Dependencies
+- Python >= 3.8 (Recommend to use [Anaconda](https://www.anaconda.com/download/#linux))
+- [PyTorch >= 1.11.0](https://pytorch.org/)
+- NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)
+### Installation
+1. Clone repo
+    ```bash
+    git clone https://github.com/AILab-CVC/SEED.git
+    cd SEED
+    ```
+2. Install dependent packages
+    ```bash
+    pip install -r requirements.txt
+    ```
+### Model Weights
+We provide the pretrained SEED Tokenizer and De-Tokenizer, instruction tuned SEED-LLaMA-8B and SEED-LLaMA-14B.
+Please download the checkpoints and save under the folder `./pretrained`.
+To reconstruct the image from the SEED visual codes using unCLIP SD-UNet, please download the pretrained [unCLIP SD](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip).
+Rename the checkpoint directory to **"diffusion_model"** and create a soft link to the "pretrained/seed_tokenizer" directory.
+### Inference for visual tokenization and de-tokenization
+To discretize an image to 1D visual codes with causal dependency, and reconstruct the image from the visual codes using the off-the-shelf unCLIP SD-UNet:
+```bash
+python scripts/seed_tokenizer_inference.py
+```
+### Launching Demo of SEED-LLaMA Locally
+```bash
+sh start_backend.sh
+sh start_frontend.sh
+```
+## Citation
+If you find the work helpful, please consider citing:
+```bash
+@article{ge2023making,
+  title={Making LLaMA SEE and Draw with SEED Tokenizer},
+  author={Ge, Yuying and Zhao, Sijie and Zeng, Ziyun and Ge, Yixiao and Li, Chen and Wang, Xintao and Shan, Ying},
+  journal={arXiv preprint arXiv:2310.01218},
+  year={2023}
+}
+@article{ge2023planting,
+  title={Planting a seed of vision in large language model},
+  author={Ge, Yuying and Ge, Yixiao and Zeng, Ziyun and Wang, Xintao and Shan, Ying},
+  journal={arXiv preprint arXiv:2307.08041},
+  year={2023}
+}
+```
+The project is still in progress. Stay tuned for more updates!
+## License
+`SEED` is released under [Apache License Version 2.0](License.txt).
+`SEED-LLaMA` is released under the original [License](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) of [LLaMA2](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf).
+## Acknowledgement
+We thank the great work from [unCLIP SD](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip) and [BLIP2](https://github.com/salesforce/LAVIS).