Tags: Text Generation, Transformers, Safetensors, qwen3, turkish, türkiye, reasoning, ai, lamapi, gemma3, next, next-x1, open-source, 32b, large-language-model, llm, transformer, artificial-intelligence, machine-learning, nlp, multilingual, instruction-tuned, chat, generative-ai, optimized, trl, sft, cognitive, analytical, enterprise, industrial, conversational, text-generation-inference, 4-bit precision, bitsandbytes
Update README.md

README.md CHANGED
````diff
@@ -147,28 +147,28 @@ Designed for high-demand enterprise environments, **Next 32B** delivers superior
 **Note:** Due to the model size, we recommend using a GPU with at least 24GB VRAM (for 4-bit quantization) or 48GB+ (for 8-bit/FP16).
 
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-import torch
-
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(
-    model_id,
-    torch_dtype=torch.float16,
-    device_map="auto"
-)
+from unsloth import FastLanguageModel
+
+model, tokenizer = FastLanguageModel.from_pretrained("Lamapi/next-32b-4bit")
 
 messages = [
-    {"role": "system", "content": "You are Next-X1, an
+    {"role": "system", "content": "You are Next-X1, an AI assistant created by Lamapi. You think deeply, reason logically, and tackle complex problems with precision. You are a helpful, smart, kind, concise AI assistant."},
     {"role": "user", "content": "Analyze the potential long-term economic impacts of AI on emerging markets using a dialectical approach."}
 ]
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+    enable_thinking=True,
+)
 
+from transformers import TextStreamer
+_ = model.generate(
+    **tokenizer(text, return_tensors="pt").to("cuda"),
+    max_new_tokens=1024,  # Increase for longer outputs!
+    temperature=0.7, top_p=0.95, top_k=400,
+    streamer=TextStreamer(tokenizer, skip_prompt=True),
+)
 ```
 
 ---
````
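The VRAM note in the diff can be sanity-checked with a quick back-of-the-envelope estimate of weight memory alone. This sketch assumes the parameter count (~32 billion) implied by the model name; the KV cache, activations, and runtime overhead add more on top, which is why the README recommends extra headroom:

```python
# Rough weight-memory estimate for a ~32B-parameter model.
# Assumption: parameter count is inferred from the model name,
# not read from the actual model config.
PARAMS = 32e9

def weight_gib(bits_per_param: float) -> float:
    """GiB required just to store the weights at a given precision."""
    return PARAMS * bits_per_param / 8 / 1024**3

for name, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_gib(bits):.1f} GiB for weights alone")
```

At roughly 15 GiB of weights, a 4-bit quantization leaves room on a 24 GB card for the KV cache and overhead, while fp16 weights alone (~60 GiB) explain why full-precision inference needs substantially more than a single consumer GPU.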