Fine tuning code?
When can we expect the model fine-tuning code to be added to GitHub?
We expect to release it in February.
Great to hear that! Thank you for the update.
I'm trying to fine-tune this model. Please share your pretrained checkpoints.
'./checkpoints/imp-v1-3b-pretrain/mm_projector.bin'
Is there fine-tuning code for any of these?
- Google Colab
- QLoRA
- PEFT
Example Colab fine-tuning notebook:
https://colab.research.google.com/drive/1Rg44ZVPf3_cs77UUXmOp_eMzGLoQtml0?usp=sharing
I have been running some fine-tuning tests but I can't get them right. The setup seems a little different, and I don't know whether I should apply the 'tokenizer' in another way or combine everything into one input with the text. Does anyone have a working example other than this one?
I don't know if this is correct:
image_tensor = model.image_preprocess(item["image"]) # current example
The code runs to some extent, but the 'loss' is not calculated, and backward does not exist on the loss.
I would appreciate the help.
Thank you so much.
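One possible cause of the missing loss described above, offered as a guess rather than a diagnosis of that notebook: Hugging Face causal language models generally return a loss only when `labels` is passed to the forward call. The sketch below uses plain Python with no model involved to show the usual shape of that loss: labels are shifted by one position, and entries equal to -100 (the default `ignore_index` of PyTorch's cross-entropy) are skipped. Whether imp's custom forward follows exactly this convention is an assumption, and `causal_lm_loss` is a hypothetical helper name.

```python
import math

IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy by default


def causal_lm_loss(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean next-token cross-entropy over positions whose label is not ignored.

    logits: per-position score lists, shape [seq_len][vocab_size]
    labels: token ids, shape [seq_len]
    Hypothetical helper for illustration; not taken from the imp code base.
    """
    total, count = 0.0, 0
    # Position t predicts token t+1, so pair logits[:-1] with labels[1:].
    for scores, target in zip(logits[:-1], labels[1:]):
        if target == ignore_index:
            continue  # e.g. prompt or image-placeholder positions
        peak = max(scores)
        # Numerically stable log-sum-exp over the vocabulary.
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    if count == 0:
        return None  # nothing supervised, so there is no loss to backpropagate
    return total / count
```

If every label is masked, or no labels are supplied at all, there is nothing to average, which matches the symptom of a loss that cannot be used for a backward pass.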
I have not been able to fine-tune this model. Do you have any ideas?
Hi @NickyNicky @Inje, we have released the fine-tuning code on our GitHub page. Currently, only traditional full fine-tuning and LoRA fine-tuning are supported. If you get some inspiring results with QLoRA, please let us know.
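For readers weighing the two supported modes, here is a rough sketch of why LoRA trains far fewer parameters than full fine-tuning. For a single weight matrix of shape `d_out x d_in`, LoRA freezes the matrix and trains two low-rank factors of rank `r` instead. The function names and the dimensions in the usage note are illustrative assumptions, not values taken from imp or its training scripts.

```python
def full_finetune_params(d_in, d_out):
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Trainable weights for a LoRA adapter on the same matrix.

    The adapter adds B @ A, where A is rank x d_in and B is d_out x rank,
    while the original matrix stays frozen.
    """
    return rank * (d_in + d_out)
```

With an illustrative 1024 x 1024 layer and rank 8, `full_finetune_params(1024, 1024)` is 1,048,576 while `lora_params(1024, 1024, 8)` is 16,384, a 64x reduction for that layer.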