Instructions for using Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview")
model = AutoModelForCausalLM.from_pretrained("Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview
```
- SGLang
How to use Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Unsloth Studio
How to use Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview to start chatting
```
Install Unsloth Studio (Windows)
```shell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview to start chatting
```
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview",
    max_seq_length=2048,
)
```

- Docker Model Runner
How to use Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview with Docker Model Runner:
```shell
docker model run hf.co/Pinkstackorg/PinkQwen2.5-3B-1M-DPO-preview
```
This model is purely for experimental purposes. Fine-tuned on FineTome, pinkchat-sft, and pinkchat-dpo, the model is able to generate text that makes sense.
Additional fine-tuning is needed.
The model does not perform well yet, but it does work. It was fine-tuned on 2 billion tokens of mostly synthetic data, plus some human-written data, during the SFT process.
Phase 0: In mergekit, we remove 16 of the 28 layers (leaving 12: Pinkstackorg/Qwen2.5-3Bprunebase-1M) using the passthrough merge method.
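A passthrough prune like this is expressed in mergekit as a YAML config that keeps selected layer ranges. The sketch below is illustrative only: the card does not state which layers were removed or which base checkpoint was used, so the layer ranges and model name here are assumptions.

```yaml
# Hypothetical mergekit passthrough config: keep 12 of 28 layers by
# stitching together the first 6 and last 6 (actual ranges unknown).
slices:
  - sources:
      - model: Qwen/Qwen2.5-3B-Instruct   # assumed base, not confirmed by the card
        layer_range: [0, 6]
  - sources:
      - model: Qwen/Qwen2.5-3B-Instruct
        layer_range: [22, 28]
merge_method: passthrough
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./out` with such a config produces the pruned base that the later fine-tuning phases then "heal".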
Phase 1a: Fine-tuning the model on a limited amount of data with LoRA rank 16 (21% of parameters trained). This phase is to get the model started on generating something sensible; it is mainly for healing the pruned model and nothing else, and only very low-quality text would be generated.
Phase 1b: Fine-tuning the model on a larger amount of data with LoRA rank 64 (2.75% of parameters trained, due to removing lm_head and embed_tokens from the target_modules) and a high sequence length, on the same dataset (FineTome) as phase 1a. This makes the model much better at all tasks, but it is still not able to generate properly high-quality text; it is better than 1a, though.
Phase 2: Fine-tuning the model on a special dataset with synthetic generations, human text, code generations, math generations, and some QwQ generations for advanced reasoning. Phase 2 enables the model to generate higher-quality text, but it has some issues: we use a low sequence length intended only for knowledge distillation, so the model sometimes falls into loops when trying to generate long text. It is usable.
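Since phase 2 leaves the model prone to repetition loops on long generations, a common inference-time mitigation is a repetition penalty (the `repetition_penalty` argument to `generate()` in Transformers). A minimal stdlib sketch of the CTRL-style rescaling that penalty applies to already-generated tokens:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Rescale the logits of tokens that already appear in the output.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so previously seen tokens always become less likely.
    """
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

# Tokens 0 and 1 were already generated; token 2 is untouched.
print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1], penalty=2.0))
# → [1.0, -2.0, 0.5]
```

Passing `repetition_penalty=1.2` (or similar) to `model.generate(...)` applies the same rescaling inside Transformers without any custom code.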
Phase 3: DPO on our Pinkstack/Pinkchat-dpo-19k-en dataset, at a higher sequence length. This phase is highly important: it makes the model safer, better aligned, and better at following prompts. The model performs better and loops less, but is still not great.
Phase 3 was done inside Google Colab; the other phases were run locally.
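At its core, DPO trains on preference pairs by pushing the policy to prefer the chosen response over the rejected one more strongly than a frozen reference model does. A stdlib sketch of the per-pair loss (the same objective TRL's `DPOTrainer` optimizes; the log-probability values below are made-up numbers for illustration):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (policy_chosen_logp - policy_rejected_logp) \
             - (ref_chosen_logp - ref_rejected_logp)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Before training, policy == reference, so the loss starts at log(2):
print(dpo_loss(-11.0, -11.5, -11.0, -11.5))  # → log(2) ≈ 0.693

# When the policy prefers the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2):
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

The `beta` parameter controls how hard the policy is pushed away from the reference; in practice the whole phase reduces to handing a `(prompt, chosen, rejected)` dataset to `DPOTrainer`.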
Uploaded model
- Developed by: Pinkstack
- License: apache-2.0
- Finetuned from model: Pinkstack/qwen2.5-3b-1m-sft-phase2-max96-lowloss
This Qwen2 model was trained with Unsloth and Hugging Face's TRL library.