# Model Assets
PixDLM uses the following components:
| Asset | Default local path | Source |
| --- | --- | --- |
| PixDLM checkpoint | `pretrained/pixdlm-7b` | `WhynotHug/PixDLM` |
| CLIP vision tower | `checkpoints/clip-vit-large-patch14` | `openai/clip-vit-large-patch14` |
| LLaVA/Vicuna base | `checkpoints/llava-v1.6-vicuna-7b` | LLaVA/Vicuna upstream |
| SAM2 checkpoint | `checkpoints/sam2_checkpoints/sam2.1_hiera_large.pt` | SAM2 upstream |

The release scripts do not assume private filesystem locations. Pass paths
explicitly through command-line arguments or use the default relative layout.
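
As a quick way to confirm the default relative layout before running anything, the sketch below checks each path from the table above. `check_assets` is a hypothetical helper written for this document, not part of the PixDLM release scripts, and it only tests that each path exists, not that its contents are valid.

```bash
# Sketch: report which default asset paths from the table are missing.
# check_assets is a hypothetical helper, not part of the release scripts.
check_assets() {
  local root="${1:-.}"   # directory that holds pretrained/ and checkpoints/
  local missing=0
  local path
  for path in \
    "pretrained/pixdlm-7b" \
    "checkpoints/clip-vit-large-patch14" \
    "checkpoints/llava-v1.6-vicuna-7b" \
    "checkpoints/sam2_checkpoints/sam2.1_hiera_large.pt"
  do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  # Exit status is 0 when every path exists, 1 otherwise.
  return "$missing"
}
```

Calling `check_assets .` from the directory that holds `pretrained/` and `checkpoints/` prints one `missing:` line per absent asset. If the assets live elsewhere, skip the check and pass the real paths through the command-line arguments instead.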
## Weight Loading
Evaluation loads the PixDLM checkpoint and the CLIP vision tower:
```bash
--version pretrained/pixdlm-7b
--vision-tower checkpoints/clip-vit-large-patch14
```
The `pretrained/pixdlm-7b` directory is a model checkpoint directory, not the
project root. It contains the Hugging Face config and tokenizer files along
with the downloaded or merged PixDLM weights.
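
One way to catch the mistake of pointing `--version` at the project root is to look for a config file inside the directory. The sketch below assumes the checkpoint follows the usual Hugging Face convention of storing its config as `config.json`; that file name is an assumption here, and `is_checkpoint_dir` is a hypothetical helper, not part of the release scripts.

```bash
# Sketch: succeed only if DIR looks like a Hugging Face model directory.
# Assumes the config file is named config.json (the common convention).
is_checkpoint_dir() {
  local dir="$1"
  [ -d "$dir" ] && [ -f "$dir/config.json" ]
}
```

For example, `is_checkpoint_dir pretrained/pixdlm-7b || echo "not a checkpoint directory"` prints a warning when the directory is missing or has no `config.json`. It does not verify the tokenizer files or the weights themselves.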

Training from the base LLaVA/Vicuna model uses:
```bash
--version checkpoints/llava-v1.6-vicuna-7b
```
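
The two modes differ only in which directory `--version` points at. A minimal sketch of a wrapper that emits the flags shown in this document, with the default relative paths overridable through environment variables, might look like the following. The function name `build_args` and the variables `PIXDLM_CKPT`, `CLIP_TOWER`, and `LLAVA_BASE` are hypothetical and are not read by the release scripts. The sketch mirrors only the flags listed here, so any other arguments the scripts need must be added separately.

```bash
# Sketch: print the flags from this document, one per line.
# Hypothetical helper; the environment variable names are illustrative only.
build_args() {
  case "$1" in
    eval)
      printf '%s\n' \
        --version "${PIXDLM_CKPT:-pretrained/pixdlm-7b}" \
        --vision-tower "${CLIP_TOWER:-checkpoints/clip-vit-large-patch14}"
      ;;
    train)
      printf '%s\n' \
        --version "${LLAVA_BASE:-checkpoints/llava-v1.6-vicuna-7b}"
      ;;
    *)
      echo "usage: build_args eval|train" >&2
      return 2
      ;;
  esac
}
```

In bash 4 or later, `mapfile -t args < <(build_args eval)` collects the flags into an array that can be appended to whichever evaluation command the project provides as `"${args[@]}"`.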
Follow the upstream licenses for LLaVA, Vicuna/LLaMA, CLIP, and SAM2.