Configuration Parsing Warning: Config file config.json cannot be fetched (too big)
Configuration Parsing Warning: Config file tokenizer_config.json cannot be fetched (too big)
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

LLMAdv β€” Qwen3-4B-Instruct-2507 NF4 Quantized Model

This repository contains a lightweight, 4-bit NF4 quantized version of Qwen3-4B-Instruct-2507, optimized for efficient inference on CPU and Intel GPU environments (e.g., Arc 140T, WSL2).
The model is prepared for competition submission and general-purpose Japanese/English instruction following.


Model Details

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Quantization: bitsandbytes 4-bit (NF4)
  • Format: model.safetensors
  • Tokenizer: Qwen3 tokenizer (included)
  • Intended use:
    • Lightweight inference
    • Educational / research use
    • Competition submission
    • Japanese + English instruction following

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rehabiliworld1/LLMAdv"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("ζ­©θ‘Œε‘¨ζœŸγ‚’θͺ¬ζ˜Žγ—てください。", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Downloads last month
-
Safetensors
Model size
4B params
Tensor type
F32
Β·
BF16
Β·
U8
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support