Model Card for IoGPT-Instruct

IoGPT-Instruct is a fine-tuned generative text model developed by GalaxyMindAiLabs, built upon the powerful Mistral-Small-3.2-24B-Instruct-2506 architecture.

It is designed to handle complex instructions with strong reasoning while maintaining a user-friendly, engaging tone. The model supports multiple languages, including Polish, Chinese, Russian, English, Abkhazian, and Korean.

Model Details

Model Description

This model was trained to improve accuracy in responses requiring precise information, leveraging the strong 24B parameter base of Mistral Small.

Quick Start

Option 1: You can use this model with the Hugging Face transformers library.

import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_checkpoint = "galaxyMindAiLabs/IoGPT-A1-Instruct"
processor = AutoProcessor.from_pretrained(model_checkpoint)
model = AutoModelForImageTextToText.from_pretrained(
    model_checkpoint,
    torch_dtype=torch.bfloat16,  # keep a single dtype throughout
    device_map="auto",
)

user_prompt = "Why is the sky blue?"
messages = [
    {"role": "user", "content": user_prompt},
]

# Apply the model's chat template, then tokenize.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)

# do_sample=True gives more varied answers; set it to False for deterministic output.
generate_ids = model.generate(**inputs, max_new_tokens=5000, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
decoded_output = processor.batch_decode(
    generate_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]

print(decoded_output)

Option 2: Using Unsloth (faster inference)

Since this model was trained with Unsloth, using their library provides 2x faster inference.

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "galaxyMindAiLabs/IoGPT-A1-Instruct",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference mode

inputs = tokenizer(
    [
        "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
    ],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=128, use_cache=True)
print(tokenizer.batch_decode(outputs))
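Note that the snippet above tokenizes a raw string. For an instruct-tuned model you generally want the chat template applied first (in practice, call `tokenizer.apply_chat_template`). As a rough, hypothetical illustration of what a Mistral-style template produces, assuming the common `[INST] ... [/INST]` format, which may differ from this model's actual template:

```python
def build_mistral_prompt(messages):
    """Sketch of a Mistral-style instruct prompt (assumption; prefer
    tokenizer.apply_chat_template, which uses the model's real template)."""
    parts = ["<s>"]
    for m in messages:
        if m["role"] == "user":
            # User turns are wrapped in instruction markers.
            parts.append(f"[INST] {m['content']} [/INST]")
        elif m["role"] == "assistant":
            # Assistant turns end with the end-of-sequence token.
            parts.append(f"{m['content']}</s>")
    return "".join(parts)

prompt = build_mistral_prompt([{"role": "user", "content": "Hi"}])
print(prompt)  # <s>[INST] Hi [/INST]
```

Feeding the templated string to `tokenizer(...)` instead of the raw question keeps inference consistent with how the model saw data during fine-tuning.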

Training Procedure

This model was trained using Unsloth and TRL (Transformer Reinforcement Learning).

Key Improvements:
* Reasoning: Enhanced logical consistency inherited from the 24B Mistral base.
* Tone: Fine-tuned for a helpful, polite, and precise assistant persona.
* Multilingualism: Improved handling of the languages listed above.

Framework Versions
* TRL: 0.24.0
* Transformers: 4.57.6
* Pytorch: 2.10.0
* Datasets: 4.3.0
* Tokenizers: 0.22.2
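The card does not publish the exact training script. As a minimal, hypothetical sketch of supervised fine-tuning with TRL (the dataset path, hyperparameters, and output directory are placeholders, not values from this model's actual run):

```python
# Hypothetical SFT configuration sketch using TRL; assumes the `trl` and
# `datasets` packages, a GPU, and a local train.jsonl of chat examples.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: one JSON object per line with a "messages" field.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

config = SFTConfig(
    output_dir="iogpt-sft",          # placeholder output directory
    per_device_train_batch_size=2,   # illustrative hyperparameters only
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,                       # matches the BF16 tensor type of the release
)

trainer = SFTTrainer(
    model="galaxyMindAiLabs/IoGPT-A1-Instruct",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

In practice a run trained "with Unsloth" would typically load the base model through `FastLanguageModel.from_pretrained` with LoRA adapters before handing it to the trainer; the sketch above shows only the plain TRL path.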

Citations

@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}