Instructions for using Trendyol/Trendyol-LLM-8B-T1 with Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use Trendyol/Trendyol-LLM-8B-T1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Trendyol/Trendyol-LLM-8B-T1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Trendyol/Trendyol-LLM-8B-T1")
model = AutoModelForCausalLM.from_pretrained("Trendyol/Trendyol-LLM-8B-T1", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use Trendyol/Trendyol-LLM-8B-T1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Trendyol/Trendyol-LLM-8B-T1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Trendyol/Trendyol-LLM-8B-T1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
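
The same request can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on `localhost:8000` and that the reply follows the OpenAI chat-completions shape (`choices[0].message.content`). The helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_payload(model: str, user_message: str) -> dict:
    """Build the JSON body for an OpenAI-compatible chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def chat(base_url: str, model: str, user_message: str, timeout: float = 60.0) -> str:
    """POST the payload to <base_url>/v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(model, user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the server returns the OpenAI-style response structure.
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# print(chat("http://localhost:8000", "Trendyol/Trendyol-LLM-8B-T1",
#            "What is the capital of France?"))
```

The same helper should work against the SGLang server described in this document by changing the base URL to `http://localhost:30000`.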

Use Docker

docker model run hf.co/Trendyol/Trendyol-LLM-8B-T1

SGLang

How to use Trendyol/Trendyol-LLM-8B-T1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Trendyol/Trendyol-LLM-8B-T1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Trendyol/Trendyol-LLM-8B-T1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Trendyol/Trendyol-LLM-8B-T1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Trendyol/Trendyol-LLM-8B-T1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Trendyol/Trendyol-LLM-8B-T1 with Docker Model Runner:
```
docker model run hf.co/Trendyol/Trendyol-LLM-8B-T1
```

Trendyol LLM-8B-T1

Trendyol LLM-8B-T1 is an 8-billion-parameter chat model built on top of Qwen 3-8B using large-scale Turkish e-commerce datasets curated by Trendyol. The primary goal of the model is to provide advanced reasoning capabilities in Turkish.

Alongside this specialization in Turkish, the base model's strong English capabilities have also been preserved, making it an effective tool in both languages.

🔑 Highlights

Turkish reasoning – Robust chain-of-thought in Turkish across various domains.
English reasoning – Reasoning capability in English is preserved with English chain-of-thought examples.
Dual operation modes – /think (explicit reasoning) or /no_think (concise answers).
Multitask out-of-the-box
- Instruction following
- Summarisation & paraphrasing
- Coding-related tasks
- Text & review classification (sentiment, category)
- Attribute/key-value extraction for catalogue enrichment
E-commerce tuned – domain vocabulary (fashion, electronics, etc.).
Context length up to 32 k tokens natively.
Apache-2.0 licence – free for commercial and research use.
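
As an illustration of the attribute/key-value extraction use case listed above, the sketch below builds a chat message for a product title and parses a JSON reply. The prompt wording, the helper names (`build_extraction_messages`, `parse_attributes`), and the expectation that the model replies with a JSON object are all assumptions made for this example; the model card does not prescribe a prompt or output format.

```python
import json


def build_extraction_messages(product_title: str, attributes: list[str]) -> list[dict]:
    """Build chat messages asking for the given attributes as a JSON object."""
    # Turkish prompt (assumed wording): "Extract the following attributes from
    # the product title below and return only JSON: ..."
    instruction = (
        "Aşağıdaki ürün başlığından şu özellikleri çıkar ve yalnızca JSON döndür: "
        + ", ".join(attributes)
        + f"\n\nÜrün başlığı: {product_title}"
    )
    return [{"role": "user", "content": instruction}]


def parse_attributes(reply: str) -> dict:
    """Parse the text between the first '{' and the last '}' as JSON.

    Returns an empty dict if no JSON object can be recovered.
    """
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        parsed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

The messages returned by `build_extraction_messages` can be passed to the pipeline from the quick-start section, and the generated text to `parse_attributes`. Because model output is not guaranteed to be valid JSON, the parser falls back to an empty dictionary rather than raising.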

🚀 Quick-start

import transformers
import torch

model_id = "Trendyol/Trendyol-LLM-8B-T1"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "attn_implementation": "flash_attention_2",
        "device_map": "auto"
    }
)

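messages = [
    # System prompt: "You are a helpful assistant."
    {"role": "system", "content": "Sen yardımsever bir asistansın."},
    # User prompt: "Could you write a short summary about the Mona Lisa painting?"
    {"role": "user", "content": "Mona Lisa tablosu hakkında kısa bir özet yazar mısın?"},
]

outputs = pipeline(
    messages,
    return_full_text=False,
    max_new_tokens=2048,
)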
assistant_response = outputs[0]["generated_text"].strip()
print(assistant_response)

🛠️ Using `/think` & `/no_think`

/think – the model emits a <think> … </think> block containing its internal reasoning before the final answer. This is the default behaviour.
/no_think – append this directive to the last user turn to turn off thinking mode.
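
The two modes can be handled with a pair of small helpers, sketched below. `with_no_think` appends the directive to the last user turn, and `split_reasoning` separates a `<think> … </think>` block from the final answer. Both function names are hypothetical, and joining the directive with a single space is an assumption about formatting rather than a documented requirement.

```python
import re

# Matches one <think> ... </think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def with_no_think(messages: list[dict]) -> list[dict]:
    """Return a copy of `messages` with /no_think appended to the last user turn."""
    updated = [dict(message) for message in messages]
    for message in reversed(updated):
        if message["role"] == "user":
            message["content"] = message["content"].rstrip() + " /no_think"
            break
    return updated


def split_reasoning(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    The reasoning part is empty when no <think> block is present.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `pipeline(with_no_think(messages), ...)` requests a concise answer, while `split_reasoning(assistant_response)` lets an application log the reasoning and show only the answer to users.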

Limitations, Risks, Bias, and Ethical Considerations

Limitations and Known Biases

Primary Function and Application: Trendyol LLM is an autoregressive language model designed primarily to predict the next token in a text string. Although such models are used for a wide range of applications, this model has not undergone extensive real-world application testing, so its effectiveness and reliability across diverse scenarios remain largely unverified.
Language Comprehension and Generation: The model is trained primarily on standard English and Turkish. Its ability to understand and generate slang, informal language, or other languages may be limited, which can lead to errors or misinterpretations.
Generation of False Information: Users should be aware that Trendyol LLM may produce inaccurate or misleading information. Outputs should be considered as starting points or suggestions rather than definitive answers.

Risks and Ethical Considerations

Potential for Harmful Use: There is a risk that Trendyol LLM could be used to generate offensive or harmful language. We strongly discourage its use for any such purposes and emphasize the need for application-specific safety and fairness evaluations before deployment.
Unintended Content and Bias: The model was trained on a large corpus of text data, which was not explicitly checked for offensive content or existing biases. Consequently, it may inadvertently produce content that reflects these biases or inaccuracies.
Toxicity: Despite efforts to select appropriate training data, the model is capable of generating harmful content, especially when prompted explicitly. We encourage the open-source community to engage in developing strategies to minimize such risks.

Recommendations for Safe and Ethical Usage

Human Oversight: We recommend incorporating a human curation layer or using filters to manage and improve the quality of outputs, especially in public-facing applications. This approach can help mitigate the risk of generating objectionable content unexpectedly.
Application-Specific Testing: Developers intending to use Trendyol LLM should conduct thorough safety testing and optimization tailored to their specific applications. This is crucial, as the model’s responses can be unpredictable and may occasionally be biased, inaccurate, or offensive.
Responsible Development and Deployment: It is the responsibility of developers and users of Trendyol LLM to ensure its ethical and safe application. We urge users to be mindful of the model's limitations and to employ appropriate safeguards to prevent misuse or harmful consequences.
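
The filtering step recommended above could start from something as simple as the sketch below, which holds back any output containing a blocklisted term so that a human can review it. This is only an illustrative baseline: the `ReviewDecision` type, the `screen_output` function, and the blocklist contents are hypothetical, and plain substring matching is not a substitute for application-specific safety evaluation.

```python
from dataclasses import dataclass


@dataclass
class ReviewDecision:
    publish: bool        # True if the text can be shown without review
    matched_terms: list  # blocklist terms found in the text


def screen_output(text: str, blocklist: set) -> ReviewDecision:
    """Flag model output for human review if it contains any blocklisted term."""
    # casefold() is language-neutral; it does not apply Turkish-specific
    # casing rules (for example dotted and dotless i).
    lowered = text.casefold()
    matched = sorted(term for term in blocklist if term.casefold() in lowered)
    return ReviewDecision(publish=not matched, matched_terms=matched)
```

Outputs with `publish=False` would be routed to a human curator instead of being shown directly in a public-facing application.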

📜 Licence

Apache-2.0 – identical to the base Qwen 3-8B.

📖 Citation

@misc{trendyolLLM8BT1,
  title   = {Trendyol LLM 8B T1},
  author  = {{Trendyol LLM \& Core NLP Team}},
  year    = {2025},
  url     = {https://huggingface.co/Trendyol/Trendyol-LLM-8B-T1}
}