Instructions to use Vortex5/MS3.2-24B-Penumbra-Aether with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Vortex5/MS3.2-24B-Penumbra-Aether with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Vortex5/MS3.2-24B-Penumbra-Aether")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Vortex5/MS3.2-24B-Penumbra-Aether")
model = AutoModelForCausalLM.from_pretrained("Vortex5/MS3.2-24B-Penumbra-Aether")

Local Apps

vLLM

How to use Vortex5/MS3.2-24B-Penumbra-Aether with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Vortex5/MS3.2-24B-Penumbra-Aether"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vortex5/MS3.2-24B-Penumbra-Aether",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
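
For callers that are not using a shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch, assuming the server started above is listening on localhost:8000 and returns the usual OpenAI-style `choices[0].text` field; the helper names `build_completion_request` and `complete` are made up for illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    model="Vortex5/MS3.2-24B-Penumbra-Aether",
    base_url="http://localhost:8000",  # assumption: default vLLM port from the curl example
    max_tokens=512,
    temperature=0.5,
):
    """Build (but do not send) a POST request mirroring the curl call."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    req = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` would return the continuation as a string. Note that `/v1/completions` sends the prompt as raw text, so no chat template is applied on this route.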

Use Docker

docker model run hf.co/Vortex5/MS3.2-24B-Penumbra-Aether

SGLang

How to use Vortex5/MS3.2-24B-Penumbra-Aether with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Vortex5/MS3.2-24B-Penumbra-Aether" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vortex5/MS3.2-24B-Penumbra-Aether",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Vortex5/MS3.2-24B-Penumbra-Aether" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Vortex5/MS3.2-24B-Penumbra-Aether",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Vortex5/MS3.2-24B-Penumbra-Aether with Docker Model Runner:
```
docker model run hf.co/Vortex5/MS3.2-24B-Penumbra-Aether
```

any successes with using this model?

by case-ai - opened Jan 7

Discussion

case-ai

Jan 7

This model doesn't seem to output anything other than generic responses, regardless of input. I have fairly context-heavy roleplay prompts, ~4k-6k on average, and they're run through an API that's been set up to work with more or less every model syntax out there. Am I missing something? I'm using the mradermacher/MS3.2-24B-Penumbra-Aether-i1-GGUF quants.

Example outputs:

{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"-7\n\n## Mistral v7 for Tekken 7\n\nThis model is a fine-tuned version of mistralai/Mistral-7B-v0.1. It was trained on various conversation datasets, including Tekken 7 and gaming, to provide more accurate and helpful responses in those domains.\n\nView API docs View model on Mistral AI Console\n\nThis model is a fine-tuned version of mistralai/Mistral-7B-v0.1. It was trained on various conversation datasets, including Tekken 7 and gaming, to provide more accurate and helpful responses in those domains. It is designed to help with conversations related to the game, including character strategies, move sets, and general gameplay tips.\n\n## Use Cases\n\n- Character Strategies: Get insights into the best strategies for playing specific Tekken 7 characters.\n- Move Sets: Learn about the move sets and combos for different characters.\n- Gameplay Tips: Receive tips and advice to improve your gameplay.\n- Community Discussions: Engage in discussions about the game, including updates, patches, and community events.\n\n## Prompt Examples\n\n- \"What are the best strategies for playing as Kazumi Mishima in Tekken 7?\"\n- \"Can you explain the move set for Paul Phoenix?\"\n- \"What are some tips for improving my defense in Tekken 7?\"\n- \"What are the latest updates and patches for Tekken 7?\""}}

{"finish_reason":"stop","index":0,"message":{"role":"assistant","content":"[English (en)]\n# Mistral AI\n\nMistral AI is a cutting-edge AI lab based in France, focused on developing advanced large language models and AI tools. Our team combines world-class researchers, engineers, and entrepreneurs from leading organizations such as Meta Platforms, Google DeepMind, and Google Brain, as well as top academic institutions like NYU, Cambridge, and Oxford.\n\n## Core Values\n\n- **Open & Honest**: We believe in transparency and honesty in all our actions.\n- **Collaborative & Kind**: We foster a culture of collaboration and kindness, where everyone feels valued and respected.\n- **Ambitious & Pragmatic**: We set high goals but remain grounded in practicality.\n- **Curious & Creative**: We encourage curiosity and creativity to drive innovation.\n\n## Projects\n\n- **Mistral Large**: A large-scale language model designed for high performance and efficiency.\n- **Le Chat**: An open-source conversational AI model.\n- **Mixtral**: An open-weight 8x7B sparse mixture of experts model.\n- **Codium**: An open-source AI model for code generation.\n- **Distil**: An open-source 40B parameter language model.\n\n## Research\n\nOur research spans a wide range of topics, including:\n\n- **Natural Language Processing**: Developing models that understand and generate human language.\n- **Machine Learning**: Advancing techniques for training and deploying AI models.\n- **AI Safety**: Ensuring that AI systems are safe, reliable, and trustworthy.\n\n## Community\n\nWe are committed to building a diverse and inclusive community of researchers, developers, and AI enthusiasts. We regularly host events, workshops, and hackathons to foster collaboration and learning.\n\n## Careers\n\nWe are always looking for talented individuals to join our team. If you are passionate about AI and want to make a difference, we encourage you to explore our career opportunities.\n\nFor more information, visit our website at [mistral.ai](https://mistral.ai)."}}

Vortex5

Owner Jan 7

Which API?

case-ai

Jan 7

> Which API?

It's not a standardized web API; it's one I rolled as an interface for an experimental game. In this use case the API is talking to an HF endpoint, but it does work consistently with dozens of other models and several hosts. The endpoint runs the model on llama.cpp (standard settings across the board).

Vortex5

Owner Jan 7

It works fine in SillyTavern. I run llama.cpp too, and the model doesn't work in the web UI. I am guessing it is missing a baked-in chat template?
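
One way to check that guess on the source repository is to look for a `chat_template` entry in its `tokenizer_config.json`. The sketch below assumes the template, if present, is stored under that key either as a single string or as a list of `{"name", "template"}` entries; newer toolchains may keep it in a separate file instead, and GGUF files carry it in their own metadata, so treat a negative result as a hint rather than proof. `find_chat_template` is a hypothetical helper name.

```python
import json
from typing import Optional


def find_chat_template(tokenizer_config: dict) -> Optional[str]:
    """Return the default chat template from a parsed tokenizer_config.json, or None."""
    template = tokenizer_config.get("chat_template")
    if isinstance(template, str) and template.strip():
        return template
    if isinstance(template, list):
        # Assumed layout: [{"name": "default", "template": "..."}, ...]
        named = {
            entry.get("name"): entry.get("template")
            for entry in template
            if isinstance(entry, dict)
        }
        if named.get("default"):
            return named["default"]
        # Fall back to the first non-empty template, if any.
        for value in named.values():
            if value:
                return value
    return None


def load_config(path: str) -> dict:
    """Read a local tokenizer_config.json (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `find_chat_template(load_config("tokenizer_config.json")) is None` would suggest the repository ships without a baked-in template.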

case-ai

Jan 7

That makes sense to me; I'm still just a consumer of models and haven't gotten hands-on with finetunes, etc., so I don't know all the steps involved in making them work across platforms/engines. SillyTavern seems to handle a lot under the hood and might supply a default ChatML template. I'm not sure if there's a way to slot a template into the endpoint config, but I'll dig into it tonight, thanks!

case-ai

Jan 7

No luck with any of the expected llama.cpp --chat-template <template> options based on the token config. That's pretty much all I could find as a potential solution without direct access to the environment (which manual template file loading would need).

Vortex5

Owner Jan 7

> No luck with using any of the expected llama.cpp --chat-template <template> options based on the token config, that's pretty much all I could find as a potential solution without direct access to the environment (manual template file loading).

I have tried this as well in the past. The 12B models seem to work fine on the llama.cpp web UI, but in my testing, 24B models without a baked-in chat template did not work, while those with one did. I could be wrong, but for the next merge I will set a chat template, as that seems to be the issue; it couldn't hurt anyway. I am glad you commented, as I think I have figured it out. It is hard to find good info on AI sometimes.
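
To illustrate what a chat template does, here is a rough sketch that flattens a message list into one prompt string. It assumes the merge follows the Mistral V7-Tekken layout used by the Mistral Small 3.x family (`[SYSTEM_PROMPT]…[/SYSTEM_PROMPT]`, `[INST]…[/INST]`, assistant turns closed with `</s>`); that is an assumption, not something confirmed in this thread, and `render_prompt` is a hypothetical helper.

```python
from typing import Dict, List


def render_prompt(messages: List[Dict[str, str]], add_bos: bool = False) -> str:
    """Flatten chat messages into a single prompt string (assumed V7-Tekken layout)."""
    # Many servers prepend the BOS token themselves, so it is off by default here.
    parts = ["<s>"] if add_bos else []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "system":
            parts.append(f"[SYSTEM_PROMPT]{content}[/SYSTEM_PROMPT]")
        elif role == "user":
            parts.append(f"[INST]{content}[/INST]")
        elif role == "assistant":
            # A finished assistant turn is closed with the end-of-sequence token.
            parts.append(f"{content}</s>")
        else:
            raise ValueError(f"unsupported role: {role!r}")
    return "".join(parts)


example = render_prompt(
    [
        {"role": "system", "content": "You are the narrator."},
        {"role": "user", "content": "Describe the tavern."},
    ]
)
```

When no template is baked in, a server has to guess this wrapping or send the messages unwrapped. In the latter case the model just continues raw text, which would be consistent with the off-topic outputs shown earlier.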

case-ai

Jan 7

Awesome, I will keep an eye open!

Vortex5 changed discussion status to closed Jan 8
