---
language:
- tr
license: apache-2.0
tags:
- gpt2
- turkish
- instruct
- thinking
model_name: SykoLLM-V2.4-Thinking-Beta
base_model: syko818121/SykoLLM-V2.3-Turkish-Instruct
model_type: causal-lm
parameters: ~96.1M params
datasets:
- Quardo/wikipedia-turkish-qa-chattemplate
---

# SykoLLM-V2.4-Thinking-Beta

This is the latest and most experimental version of the **SykoLLM** series. Developed and trained entirely by **Burak (15 years old)**, this model is designed to explore Chain-of-Thought (CoT) capabilities in small-scale Turkish language models.

## Important Technical Distinction

This model is **not a LoRA adapter** or a **simple copy of GPT-2**. It is a standalone, full-parameter fine-tuned model whose actual weights have been modified through training. The positional embeddings were manually expanded from 512 to 1024 tokens via a custom "weight surgery" process to support a longer context natively.

## ⚠️ Important: Beta Status

This model is currently in a **strict beta phase**.

- Training of the "thinking" mechanism is still ongoing and experimental.
- **Note:** New special tokens have been added to the model, but it has not yet fully mastered the logic of "thinking" before answering.
- Users should expect inconsistent results regarding the use of `<think>` tags at this stage.

## Model Specifications

- **Model Name:** SykoLLM-V2.4-Thinking-Beta
- **Parameter Count:** ~96.1M (lightweight and fast)
- **Vocabulary Size:** 50,000 (custom tokenizer optimized for Turkish)
- **Context Window:** 1024 tokens (expanded from the original 512 via positional embedding surgery)
- **Architecture:** GPT-2 based, with modified positional embeddings to support a longer context

## Experimental "Thinking" Tokens

Special tokens have been added to the tokenizer to prepare the model for reasoning tasks:

- `<think>`: Intended for the model's internal reasoning process.
- `</think>`: End of the reasoning process.
- Beginning- and end-of-string (BOS/EOS) tokens.
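The card does not document the exact "weight surgery" procedure. Below is a minimal sketch of one plausible approach, linearly interpolating the 512 learned positional rows up to 1024; the function name, the use of NumPy as a stand-in for the GPT-2 `wpe` matrix, and the interpolation choice are all assumptions for illustration:

```python
import numpy as np

def expand_positional_embeddings(wpe: np.ndarray, new_len: int) -> np.ndarray:
    """Expand a learned positional-embedding matrix of shape (old_len, d)
    to (new_len, d) by linear interpolation along the position axis.
    This is one plausible "weight surgery"; the actual method used for
    SykoLLM-V2.4 is not specified in this card."""
    old_len, d = wpe.shape
    old_grid = np.linspace(0.0, 1.0, old_len)
    new_grid = np.linspace(0.0, 1.0, new_len)
    # Interpolate each embedding dimension independently over the new grid.
    return np.stack(
        [np.interp(new_grid, old_grid, wpe[:, j]) for j in range(d)], axis=1
    )

# GPT-2 small uses 768-dim embeddings; 512 positions stand in for the old limit.
old_wpe = np.random.randn(512, 768)
new_wpe = expand_positional_embeddings(old_wpe, 1024)
print(new_wpe.shape)  # (1024, 768)
```

An alternative surgery simply copies the first 512 rows into positions 512-1023; interpolation has the advantage of preserving a smooth position-to-embedding mapping at both ends of the range.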
## 📊 Training Insights

The model was pre-trained on Turkish Wikipedia and fine-tuned on instruction datasets.

- **Learning Rate:** 2e-5
- **Optimizer:** AdamW with a cosine scheduler
- **Batch Size:** 32 (effective batch size via gradient accumulation)
- **Loss Trend:** Started at ~8.0 and converged to ~3.5 during the current training runs.

## About the Developer

SykoLLM-V2.4 is part of an ongoing project by Burak, an AI enthusiast. The goal of the project is to demonstrate that small-scale models (under 100M parameters) can be fine-tuned to handle complex Turkish language structures and reasoning patterns.

## License

Apache 2.0
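As a rough illustration of the schedule described under Training Insights, the cosine decay from the 2e-5 peak can be written in a few lines of Python. The total step count and the absence of a warmup phase are assumptions; the card does not report them:

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Cosine-decay learning rate: starts at base_lr and decays toward 0.
    total_steps is an assumed value; the card does not state it."""
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0, 10_000))       # peak: 2e-05
print(cosine_lr(5_000, 10_000))   # roughly half the peak, ~1e-05
print(cosine_lr(10_000, 10_000))  # ~0.0 at the end of training
```

The "effective batch size of 32" would then come from accumulating gradients over several smaller micro-batches before each optimizer step, a common trick when training ~100M-parameter models on limited GPU memory.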