---
pipeline_tag: text-generation
library_name: transformers
tags:
- mining
- awq
license: cc-by-nc-sa-4.0
language:
- ru
base_model: nn-tech/MetalGPT-1
---

## Description

**MetalGPT-1** is built on Qwen/Qwen3-32B and incorporates both continual pre-training and supervised fine-tuning on domain-specific data from the mining and metallurgy industry.

---

### Quantization

For convenience and better efficiency, we also offer this AWQ-quantized checkpoint of the nn-tech/MetalGPT-1 model. AWQ 4-bit quantization greatly speeds up inference and reduces memory consumption without a significant impact on quality.

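As a rough illustration of the memory saving, the sketch below estimates weight storage at 16-bit and 4-bit precision. The parameter count of about 32 billion is an assumption read off the base model name (Qwen3-32B). The estimate ignores quantization scales and zero points, any layers kept at higher precision, activations, and the KV cache, so real usage will be higher.

```python
# Back-of-the-envelope weight-memory estimate (illustrative only).
# ASSUMPTION: ~32e9 parameters, inferred from the "32B" in Qwen3-32B.
NUM_PARAMS = 32e9


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Return approximate weight storage in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16-bit baseline
awq_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit weights, quantization metadata excluded

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{awq_gb:.0f} GB")
```

Under these assumptions the weights shrink by about a factor of four, which is what determines whether the model fits on a given GPU.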

---

### HF Usage

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
import torch

# Fix the seed so sampling is repeatable on the same setup
torch.manual_seed(42)

model_name = "nn-tech/MetalGPT-1-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoAWQForCausalLM.from_quantized(
    model_name,
    device_map="auto",
)

# The prompts are in Russian, the model's target language.
# System: "You are a specialist in metallurgy."
# User: "List the pros and cons of the chloride and sulfate technologies for nickel production."
messages = [
    {"role": "system", "content": "Ты специалист в области металлургии."},
    {"role": "user", "content": "Назови плюсы и минусы хлоридной и сульфатной технологии производства никеля."},
]

# Render the conversation into a single prompt string using the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    # enable_thinking=False
)

# Place the inputs on the same device as the model's first parameters
device = next(model.parameters()).device
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
)

# Strip the prompt prefix so only newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(
    generated_ids,
    skip_special_tokens=True,
)[0]

print(response)
```
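The commented-out `enable_thinking=False` flag suggests that the chat template follows Qwen3's thinking mode, in which the model may emit its reasoning before the final answer. Assuming that the reasoning ends with a `</think>` tag that survives decoding (an assumption carried over from Qwen3, not confirmed by this card), a minimal sketch for separating the two parts could look like this. The helper name `split_thinking` is hypothetical.

```python
def split_thinking(text: str, end_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded output into (reasoning, answer).

    ASSUMPTION: reasoning, if present, ends with `end_tag`, as in Qwen3.
    If the tag is absent, the whole text is treated as the answer.
    """
    head, sep, tail = text.rpartition(end_tag)
    if not sep:
        return "", text.strip()
    # Drop the opening tag, if any, and surrounding whitespace
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = "<think>Compare both routes.</think>\n\nChloride leaching ..."
reasoning, answer = split_thinking(sample)
print(reasoning)
print(answer)
```

Applied to the example above, `reasoning, answer = split_thinking(response)` would keep only the final answer in `answer`.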

---

### vLLM Usage

```bash
vllm serve nn-tech/MetalGPT-1-AWQ --reasoning-parser qwen3
```

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server started above
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy",  # placeholder; the client requires some value
)

# System: "You are a specialist in metallurgy."
# User: "List the pros and cons of the chloride and sulfate technologies for nickel production."
response = client.chat.completions.create(
    model="nn-tech/MetalGPT-1-AWQ",
    messages=[
        {"role": "system", "content": "Ты специалист в области металлургии."},
        {"role": "user", "content": "Назови плюсы и минусы хлоридной и сульфатной технологии производства никеля."},
    ],
    temperature=0.7,
    max_tokens=1024,
)

print(response.choices[0].message.content)
```
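For long answers it can be convenient to stream tokens instead of waiting for the full completion. The sketch below assumes the same `client` as in the example above and relies on the OpenAI Python client's `stream=True` option. `collect_stream` is a hypothetical helper name, and with a reasoning parser enabled the reasoning text may arrive in a separate field that this helper ignores.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Print streamed deltas as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. for role-only chunks
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


# Usage against the running server (not executed here):
# stream = client.chat.completions.create(
#     model="nn-tech/MetalGPT-1-AWQ",
#     messages=messages,  # same message list as in the example above
#     stream=True,
# )
# full_text = collect_stream(stream)
```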