Cydonia-24B-v4.3 - OpenVINO INT4 (Stateful)

This is a Stateful OpenVINO conversion of TheDrummer/Cydonia-24B-v4.3.

It was converted specifically to ensure compatibility with OpenVINO GenAI and OpenArc (Pipeline Parallel execution on Intel GPUs).

🔧 Conversion Details

  • Precision: INT4
  • Group Size: 128
  • Mode: Stateful (Includes beam_idx and internal KV-cache management)
  • Tokenizer: Fixed (Applied fix_mistral_regex=True to resolve tokenization issues with Mistral-Small)
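
As a rough illustration of what INT4 with a group size of 128 means for the weight footprint, the sketch below estimates the storage cost per weight. It rests on assumptions that are not stated in this card: roughly 24 billion parameters (inferred from the model name), one 16-bit scale per group of 128 weights, no stored zero points, and every weight quantized to INT4. Real exports often keep some layers at higher precision, so treat the result as a lower bound rather than a measured size.

```python
def estimate_weight_bytes(num_params, bits=4, group_size=128, scale_bits=16):
    """Estimate weight storage for group-wise quantization.

    Assumption: each group of `group_size` weights shares a single
    `scale_bits`-bit scale and no zero point (symmetric quantization).
    """
    # Per-weight cost = quantized value + that weight's share of the group scale
    effective_bits = bits + scale_bits / group_size
    return num_params * effective_bits / 8


if __name__ == "__main__":
    # 24e9 is an assumed parameter count taken from the "24B" in the model name
    size_bytes = estimate_weight_bytes(24e9)
    print(f"Estimated weights: {size_bytes / 1e9:.3f} GB")
```

Under these assumptions the scale overhead is 16/128 = 0.125 bits per weight, so the effective precision is 4.125 bits. The KV cache and activations come on top of this figure.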

💻 Compatibility

This model has been tested and verified on:

  • Hardware: Dual Intel Arc A770 (16GB) via OpenArc (Pipeline Parallel).
  • Performance: Significantly higher throughput in tokens per second (T/s) than with Vulkan/SYCL backends.

🚀 How to Run (Python OpenVINO GenAI)

import openvino_genai as ov_genai

# Point to the folder where you downloaded this model
pipe = ov_genai.LLMPipeline("./Cydonia-24B-v4.3-OpenVINO-INT4", "CPU") # or "GPU"

print(pipe.generate("Explain quantum mechanics in one sentence.", max_new_tokens=100))
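
If the pipeline fails to load, it can help to confirm that the download is complete before debugging anything else. The helper below is a minimal sketch: the file names in `EXPECTED_FILES` are an assumption based on the layout that OpenVINO exports commonly use, not a list taken from this repository, and `missing_model_files` is a hypothetical name introduced here for illustration.

```python
from pathlib import Path

# Assumed file names (typical OpenVINO export layout); adjust if this
# repository uses different ones.
EXPECTED_FILES = (
    "openvino_model.xml",
    "openvino_model.bin",
    "openvino_tokenizer.xml",
    "openvino_detokenizer.xml",
)


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("./Cydonia-24B-v4.3-OpenVINO-INT4")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("Model folder looks complete.")
```

An empty result only means the expected files exist. It does not verify that they were downloaded without corruption.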