Configuration Parsing Warning:Config file config.json cannot be fetched (too big)

Configuration Parsing Warning:Config file tokenizer_config.json cannot be fetched (too big)

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

LLMAdv — Qwen3-4B-Instruct-2507 NF4 Quantized Model

This repository contains a lightweight, 4-bit NF4 quantized version of Qwen3-4B-Instruct-2507, optimized for efficient inference on CPU and Intel GPU environments (e.g., Arc 140T, WSL2).
The model is prepared for competition submission and general-purpose Japanese/English instruction following.

Model Details

Base model: Qwen/Qwen3-4B-Instruct-2507
Quantization: bitsandbytes 4-bit (NF4)
Format: model.safetensors
Tokenizer: Qwen3 tokenizer (included)
Intended use:
- Lightweight inference
- Educational / research use
- Competition submission
- Japanese + English instruction following

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rehabiliworld1/LLMAdv"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("歩行周期を説明してください。", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Downloads last month: 1

Safetensors

Model size

4B params

Tensor type

F32

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support