Cydonia-24B-v4.3 - OpenVINO INT4 (Stateful)

This is a Stateful OpenVINO conversion of TheDrummer/Cydonia-24B-v4.3.

It was converted specifically to ensure compatibility with OpenVINO GenAI and OpenArc (Pipeline Parallel execution on Intel GPUs).

🔧 Conversion Details

Precision: INT4
Group Size: 128
Mode: Stateful (includes the beam_idx input and internal KV-cache management)
Tokenizer: Fixed (loaded with fix_mistral_regex=True to resolve tokenization issues with the Mistral-Small tokenizer)

💻 Compatibility

This model has been tested and verified on:

Hardware: Dual Intel Arc A770 (16GB) via OpenArc (Pipeline Parallel).
Performance: Significantly higher throughput (tokens/s) than with Vulkan/SYCL backends.

🚀 How to Run (Python OpenVINO GenAI)

import openvino_genai as ov_genai

# Point to the folder where you downloaded this model
pipe = ov_genai.LLMPipeline("./Cydonia-24B-v4.3-OpenVINO-INT4", "CPU") # or "GPU"

print(pipe.generate("Explain quantum mechanics in one sentence.", max_new_tokens=100))
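
Instead of hard-coding "CPU" or "GPU", the device string can be chosen at runtime. The sketch below is one possible approach, not part of the original instructions. pick_device and run are hypothetical helper names. The code assumes OpenVINO reports devices with names such as "CPU", "GPU.0" and "GPU.1", and that the model fits in the memory of the selected device.

```python
def pick_device(available_devices):
    """Return the first GPU device name if one is listed, otherwise "CPU"."""
    for name in available_devices:
        if name.startswith("GPU"):
            return name
    return "CPU"


def run(model_dir, prompt, max_new_tokens=100):
    """Load the model on the preferred device and generate a completion."""
    # Imported lazily so pick_device() can be used without OpenVINO installed.
    import openvino as ov
    import openvino_genai as ov_genai

    device = pick_device(ov.Core().available_devices)
    pipe = ov_genai.LLMPipeline(model_dir, device)
    return device, pipe.generate(prompt, max_new_tokens=max_new_tokens)


# Quick check of the selection logic with example device lists.
print(pick_device(["CPU", "GPU.0", "GPU.1"]))
print(pick_device(["CPU"]))
```

Calling run("./Cydonia-24B-v4.3-OpenVINO-INT4", "Explain quantum mechanics in one sentence.") returns the chosen device together with the generated text. This uses a single device only. The pipeline-parallel dual-GPU setup mentioned above goes through OpenArc and is not covered by this sketch.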

Model tree for DudePls/Cydonia-24B-v4.3-OpenVINO-INT4

Base model: mistralai/Mistral-Small-3.1-24B-Base-2503
Finetuned: mistralai/Mistral-Small-3.2-24B-Instruct-2506
Finetuned: TheDrummer/Cydonia-24B-v4.3
Finetuned: this model