

burak

Update README.md

95e4339 verified 3 months ago


	---
	library_name: transformers
	license: apache-2.0
	datasets:
	- TFLai/Turkish-Alpaca
	language:
	- tr
	---

	# Model Card: SykoLLM-V1.2-Turkish-Instruct

	SykoLLM-V1.2-Turkish-Instruct is a lightweight large language model (LLM) with a custom architecture, designed specifically for Turkish conversational tasks. Rather than reusing a stock pre-built configuration, it uses a custom configuration tuned for speed and efficiency in low-resource environments.

	## Model Description

	* Developed by: syko818121
	* Model Name: SykoLLM-V1.2-Turkish-Instruct
	* Model Type: Causal Decoder-Only Custom Architecture
	* Language: Turkish
	* Parameters: ~50.8 Million
	* Training Data: Turkish Wikipedia + Custom High-Quality Chat Dataset

	## Architectural Specs

	This model uses a custom configuration designed for Turkish linguistics:

	* Vocabulary Size: 50,257
	* Hidden Dimension (n_embd): 512
	* Number of Layers: 8
	* Attention Heads: 8
	* Context Window: 512 tokens
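
As a rough sanity check on these numbers, the sketch below estimates the parameter count implied by the specs. It assumes a GPT-2-style block (learned position embeddings, biased linear layers, a 4x MLP expansion, two LayerNorms per layer, and an output head tied to the token embedding). The card does not confirm any of these details for the custom architecture, so the estimate is only expected to land near the stated ~50.8M, not to match it exactly.

```python
def estimate_params(vocab_size: int, n_embd: int, n_layer: int, n_positions: int,
                    tie_embeddings: bool = True) -> int:
    """Estimate parameters for a GPT-2-style decoder (an assumption, not the card's spec)."""
    tok_emb = vocab_size * n_embd              # token embedding table
    pos_emb = n_positions * n_embd             # learned position embeddings (assumed)
    attn = 4 * n_embd * n_embd + 4 * n_embd    # QKV projection + output projection, with biases
    mlp = 8 * n_embd * n_embd + 5 * n_embd     # d -> 4d -> d feed-forward, with biases
    layer_norms = 2 * (2 * n_embd)             # two LayerNorms per block (scale + shift)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * n_embd                      # final LayerNorm before the output head
    lm_head = 0 if tie_embeddings else vocab_size * n_embd
    return tok_emb + pos_emb + n_layer * per_layer + final_ln + lm_head


# Values taken from the spec list above.
total = estimate_params(vocab_size=50_257, n_embd=512, n_layer=8, n_positions=512)
head_dim = 512 // 8  # per-head dimension implied by 512 hidden units and 8 heads

print(f"estimated parameters: {total / 1e6:.1f}M")
print(f"head dimension: {head_dim}")
```

Under these assumptions roughly half of the parameters sit in the token embedding table (50,257 x 512), which is typical for a model this small with a large vocabulary.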

	## Fine-Tuning & Conversation Style

	The model was fine-tuned on a high-quality, curated Turkish dataset to ensure natural, human-like responses. The training data distribution was carefully balanced:

	* Greetings & Daily Talk (40%): Natural openings and casual conversation.
	* Direct Question-Answering (30%): Short and concise answers to general knowledge queries.
	* Brief Explanations (20%): Simplified definitions for complex concepts.
	* Slang & Short Inputs (10%): Robustness against one-word or incomplete messages.
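
To make the split concrete, the sketch below turns the stated percentages into per-category example counts. The category keys and the dataset size passed in are hypothetical and chosen only for illustration; the card does not state how many examples the fine-tuning set contains or how the mix was enforced.

```python
# Category shares as listed above (percent of the fine-tuning data).
# The key names are illustrative labels, not identifiers from the dataset.
MIX_PERCENT = {
    "greetings_daily_talk": 40,
    "direct_question_answering": 30,
    "brief_explanations": 20,
    "slang_short_inputs": 10,
}


def examples_per_category(total_examples: int) -> dict[str, int]:
    """Split `total_examples` rows according to MIX_PERCENT."""
    assert sum(MIX_PERCENT.values()) == 100
    counts = {name: total_examples * pct // 100 for name, pct in MIX_PERCENT.items()}
    # Give any rounding remainder to the largest category so counts sum to the total.
    remainder = total_examples - sum(counts.values())
    counts[max(MIX_PERCENT, key=MIX_PERCENT.get)] += remainder
    return counts


# 10_000 is a hypothetical dataset size.
print(examples_per_category(10_000))
```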



	## Usage

	You can load and test the model with the following snippet:

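	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Repository ID as given in this card; adjust it if the model lives under a different name.
	model_id = "syko818121/SykoLLM-V1-Turkish"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	# trust_remote_code=True lets transformers run the custom model code shipped in the repository.
	model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
	model.eval()  # inference mode (disables dropout)

	# Single-turn chat format: the user message sits between the <user> and <assistant> tags.
	prompt = "<user> Selam, naber?<assistant>"
	inputs = tokenizer(prompt, return_tensors="pt")
	# Reuse EOS as the padding token so generate() has a pad_token_id to work with.
	outputs = model.generate(**inputs, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)

	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```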
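
The prompt format in the snippet can be wrapped in a few small helpers, sketched below. The helper names are hypothetical. The single-turn layout is inferred from the one example prompt in this card, and whether the model supports multi-turn history is not stated. If `<user>` and `<assistant>` are registered as special tokens, decoding with `skip_special_tokens=True` strips them, in which case `extract_reply` simply returns the whole decoded string.

```python
USER_TAG = "<user>"
ASSISTANT_TAG = "<assistant>"
CONTEXT_WINDOW = 512  # tokens, from the architecture specs in this card


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn format used by the card's example."""
    return f"{USER_TAG} {user_message.strip()}{ASSISTANT_TAG}"


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant tag, or the whole string if the tag is absent."""
    return decoded.rsplit(ASSISTANT_TAG, 1)[-1].strip()


def max_prompt_tokens(max_new_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt length (in tokens) that still leaves room for the requested generation."""
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens must be smaller than the context window")
    return context_window - max_new_tokens


print(build_prompt("Selam, naber?"))
print(max_prompt_tokens(50))
```

With `max_new_tokens=50` as in the snippet, the prompt should stay within 462 tokens so that prompt plus generation fits in the 512-token window.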

	## Training Configuration

	* Learning Rate: 5e-5
	* Scheduler: Cosine
	* Epochs: 15
	* Batch Size: 4
	* Precision: FP16 (Mixed Precision)
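
For intuition about how these settings interact, the sketch below computes a cosine learning-rate schedule from the peak rate, epoch count, and batch size listed above. It assumes no warmup, decay to zero, no gradient accumulation, and a single device, none of which the card specifies. The dataset size used in the example is hypothetical.

```python
import math

PEAK_LR = 5e-5   # learning rate from the list above
EPOCHS = 15
BATCH_SIZE = 4


def total_steps(num_examples: int, batch_size: int = BATCH_SIZE, epochs: int = EPOCHS) -> int:
    """Optimizer steps for the whole run, assuming one step per batch."""
    return math.ceil(num_examples / batch_size) * epochs


def cosine_lr(step: int, num_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 down to 0 at num_steps (no warmup assumed)."""
    progress = min(max(step / num_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# 1_000 examples is a hypothetical dataset size.
steps = total_steps(1_000)
for step in (0, steps // 2, steps):
    print(f"step {step}: lr = {cosine_lr(step, steps):.2e}")
```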

	## Limitations

	* Size: With about 50.8M parameters, this is a "micro-LLM." It handles short chats well but may hallucinate on tasks that require complex, multi-step reasoning.
	* Response Length: The model is intentionally biased toward concise and direct answers rather than long-form essays.

	---