---
license: mit
---

<h1 align="center"> Moxin 7B VLM </h1>

<p align="center"> <a href="https://github.com/moxin-org/Moxin-VLM">Home Page</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://arxiv.org/abs/2412.06845">Technical Report</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://huggingface.co/moxin-org/Moxin-7B-LLM">Base Model</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://huggingface.co/moxin-org/Moxin-7B-Chat">Chat Model</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://huggingface.co/moxin-org/Moxin-7B-Instruct">Instruct Model</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://huggingface.co/moxin-org/Moxin-7B-Reasoning">Reasoning Model</a> &nbsp;&nbsp; | &nbsp;&nbsp; <a href="https://huggingface.co/moxin-org/Moxin-7B-VLM">VLM Model</a> </p>

---

## Installation

```bash
git clone https://github.com/moxin-org/Moxin-VLM.git
cd Moxin-VLM

conda create -n moxin-vlm python=3.10 -y
conda activate moxin-vlm

pip install torch==2.4.1 torchvision==0.19.1
pip install transformers==4.46.0 peft==0.15.2
pip install -e .

# Install Flash Attention 2
# =>> If you run into difficulty, try `pip cache remove flash_attn` first
pip install flash-attn==2.6.3 --no-build-isolation
```
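If the exact pins above matter for reproducibility, a small stdlib-only check (a sketch, not part of the repo) can confirm what actually got installed:

```python
# Sketch: verify that installed package versions match the README's pins.
# Uses only the standard library; the PINS mapping mirrors the pip commands above.
from importlib.metadata import version, PackageNotFoundError

PINS = {
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "transformers": "4.46.0",
    "peft": "0.15.2",
    "flash-attn": "2.6.3",
}

def check_pins(pins):
    """Return {package: (installed_or_None, expected)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (installed, expected)
    return mismatches

if __name__ == "__main__":
    for name, (got, want) in check_pins(PINS).items():
        print(f"{name}: installed {got}, expected {want}")
```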
## Pretrained Models

Please find our pretrained models on our Hugging Face page: [moxin-org/Moxin-7B-VLM](https://huggingface.co/moxin-org/Moxin-7B-VLM).

We also provide a Hugging Face-converted version, [Moxin-7B-VLM-hf](https://huggingface.co/bobchenyx/Moxin-7B-VLM-hf), based on [openvla](https://github.com/openvla/openvla).

To download and run the model locally, use the attached scripts:

```bash
python scripts/snapshot_download.py
```
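If you prefer not to use the bundled script, the same download can be done directly with `huggingface_hub` (installed alongside `transformers`). This is a minimal sketch; the `local_dir` path is an arbitrary example, not a location the repo scripts depend on:

```python
# Sketch: fetch all files of the model repo with huggingface_hub's
# snapshot_download instead of scripts/snapshot_download.py.
from huggingface_hub import snapshot_download

def download_moxin_vlm(local_dir="checkpoints/Moxin-7B-VLM"):
    """Download the full model repo into local_dir and return the local path."""
    return snapshot_download(repo_id="moxin-org/Moxin-7B-VLM", local_dir=local_dir)

if __name__ == "__main__":
    print(download_moxin_vlm())
```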
## Usage

For a complete terminal-based CLI for interacting with our VLMs:

```bash
python scripts/generate.py --model_path moxin-org/Moxin-7B-VLM
```

For faster loading, inference, and a demo:

```bash
python scripts/fast_inference.py
```
---

## Acknowledgments

This project is based on [Prismatic VLMs](https://github.com/TRI-ML/prismatic-vlms) by [TRI-ML](https://github.com/TRI-ML).
Special thanks to the original contributors for their excellent work.

## Citation

If you find our code or models useful in your work, please cite [our paper](https://arxiv.org/abs/2412.06845v5):

```bibtex
@article{zhao2024fully,
  title={Fully Open Source Moxin-7B Technical Report},
  author={Zhao, Pu and Shen, Xuan and Kong, Zhenglun and Shen, Yixin and Chang, Sung-En and Rupprecht, Timothy and Lu, Lei and Nan, Enfu and Yang, Changdi and He, Yumei and others},
  journal={arXiv preprint arXiv:2412.06845},
  year={2024}
}
```