'Shivers down his spin'
His... spin. I assume it means 'spine', but the model consistently refuses to spell the full word out in this case. Neither 'spin' nor 'spine' appears anywhere earlier in the context, so nothing there should be biasing it. Otherwise the settings I have produce perfectly coherent output.
Hi, thank you for the feedback.
The models this merge is based on have a severe case of what we call "claudeism": a writing style inspired by Anthropic's Claude, which has an obsession with such terms.
There is no direct 'solution' to this, as the training data that has been used to enhance e.g. Magnum's prose is EXTREMELY Claude-influenced, and the model has adopted both its prose and its choice of words.
Well, it's not that it's outputting 'shivers down his spine'; I'm well aware that's a claudism.
It's that it's consistently going 'spin' instead of 'spine'; for some reason the tokens that lead up to and comprise 'shivers down his' resolve with 'spin' instead of 'spine'.
Sorry! I was in a rush, so I might've coughed up my reply a bit too hastily.
But what you pointed out seems odd. It sounds like there might be a hiccup with the tokenizer there; I'll investigate the issue. Thank you!
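A possible first step for checking the tokenizer theory is a round-trip test: encode a string, decode the ids, and see whether the final "e" survives. The snippet below is only a minimal sketch of that idea, not a diagnosis. `roundtrip_report`, `TOY_VOCAB`, `toy_encode` and `toy_decode` are hypothetical names made up for illustration, and the toy vocabulary is deliberately broken so the script runs without downloading anything.

```python
from typing import Callable, Dict, List, Sequence


def roundtrip_report(
    texts: Sequence[str],
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
) -> Dict[str, dict]:
    """Encode each text, decode it again, and record whether it survived."""
    report = {}
    for text in texts:
        ids = encode(text)
        restored = decode(ids)
        report[text] = {"ids": ids, "restored": restored, "ok": restored == text}
    return report


# Toy stand-in vocabulary (an assumption, NOT the real model's vocab).
# It can produce " spin" but has no piece that supplies the trailing "e".
TOY_VOCAB = [" shivers", " down", " his", " spin"]


def toy_encode(text: str) -> List[int]:
    """Match vocabulary pieces left to right; unmatched characters are dropped."""
    ids = []
    pos = 0
    while pos < len(text):
        for idx, piece in enumerate(TOY_VOCAB):
            if text.startswith(piece, pos):
                ids.append(idx)
                pos += len(piece)
                break
        else:
            pos += 1  # no piece matches here, so this character is lost
    return ids


def toy_decode(ids: List[int]) -> str:
    """Concatenate the pieces for the given ids."""
    return "".join(TOY_VOCAB[i] for i in ids)


report = roundtrip_report([" his spin", " his spine"], toy_encode, toy_decode)
for text, row in report.items():
    print(repr(text), row)
```

To try it against the actual model, the same helper could be called with `lambda t: tokenizer.encode(t, add_special_tokens=False)` and `tokenizer.decode`, where `tokenizer = AutoTokenizer.from_pretrained("Luni/StarDust-12b-v2")` from Transformers. Strict string equality may be too harsh for tokenizers that normalize whitespace, so the printed ids and restored text are worth reading by eye. If 'spine' round-trips cleanly, that would suggest looking beyond the tokenizer's encode/decode path.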