---
license: apache-2.0
language:
- tr
- en
base_model: SykoSLM/SykoLLM-V5.9-Mini
pipeline_tag: text-generation
library_name: transformers
tags:
- nlp
- code
- phi3
- depth-up-scaling
- untrained
---

# SykoLLM-V6.0-Test

## Model Overview
**SykoLLM-V6.0-Test** is an up-scaled, structurally expanded version of earlier SykoLLM models, built on `SykoSLM/SykoLLM-V5.9-Mini`. Developed by **SykoSLM**, it is currently in an experimental testing phase.

The primary objective of this release is to provide a structurally larger foundation model by expanding both the depth (number of layers) and the width (intermediate size / MLP capacity) of the previous architecture, without losing the pre-trained knowledge.

## Architectural Expansion (Up-Scaling)
To address the "knowledge interference" (capacity bottleneck) observed in previous iterations, two significant architectural changes were applied to this model:

* **Depth Up-Scaling (DUS):** The number of hidden layers has been increased to **24**. This was achieved by carefully duplicating and mapping the existing layers to preserve the logical and syntactic capabilities of the model.
* **Width Expansion (MLP Scaling):** The `intermediate_size` has been expanded to **3072**. To prevent catastrophic forgetting, the newly added weights in the feed-forward networks were initialized to exactly zero (`0.0`). The added units therefore contribute nothing to the output during the initial forward pass, so each expanded feed-forward block computes the same function as it did before the expansion.
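The depth step can be pictured as building a map from each of the 24 new layer slots to the existing layer it copies. The sketch below is only an illustration: the interleaved, order-preserving mapping, the function name `build_layer_map`, and the source depth of 12 are all assumptions, since this card does not state the original layer count or the exact mapping that was used.

```python
def build_layer_map(old_layers: int, new_layers: int) -> list[int]:
    """Return, for each new layer slot, the index of the old layer to copy.

    Hypothetical scheme: spread the old layers evenly over the new slots,
    keeping their original order, so every old layer is used at least once.
    """
    if old_layers <= 0 or new_layers < old_layers:
        raise ValueError("need 0 < old_layers <= new_layers")
    return [slot * old_layers // new_layers for slot in range(new_layers)]


# 12 is a placeholder source depth, NOT a documented property of the base model.
layer_map = build_layer_map(12, 24)
print(layer_map)
```

With a real checkpoint, a map like this would decide which existing decoder layer's weights are copied into each position of the deeper stack.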

## ⚠️ Important Notice: Status of the Model
**The newly added parameters of this model are currently UNTRAINED.**

The model was expanded solely to save pre-training time and preserve existing knowledge. While it retains the capabilities of its predecessor, the newly added parameters (roughly 100M or more) are currently dormant (zeroed out).
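To see why zeroed-out parameters leave the behavior unchanged, here is a small dependency-free sketch. It assumes a gated feed-forward block of the kind used by Phi-3-style models (suggested by the `phi3` tag); the helper names and the toy sizes are illustrative and are not taken from the actual expansion script.

```python
import math
import random


def silu(v):
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gated_mlp(x, W_gate, W_up, W_down):
    # out = W_down @ (silu(W_gate @ x) * (W_up @ x))
    h = [silu(g) * u for g, u in zip(matvec(W_gate, x), matvec(W_up, x))]
    return matvec(W_down, h)


def widen_with_zeros(W_gate, W_up, W_down, new_size):
    """Grow the intermediate dimension; every new weight is exactly 0.0."""
    hidden = len(W_gate[0])
    extra = new_size - len(W_gate)
    if extra < 0:
        raise ValueError("new_size must not be smaller than the current size")
    W_gate2 = [row[:] for row in W_gate] + [[0.0] * hidden for _ in range(extra)]
    W_up2 = [row[:] for row in W_up] + [[0.0] * hidden for _ in range(extra)]
    W_down2 = [row[:] + [0.0] * extra for row in W_down]
    return W_gate2, W_up2, W_down2


random.seed(0)
hidden, old_size, new_size = 4, 6, 10  # toy sizes, not the model's real ones


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


x = [random.uniform(-1, 1) for _ in range(hidden)]
W_gate, W_up = rand_matrix(old_size, hidden), rand_matrix(old_size, hidden)
W_down = rand_matrix(hidden, old_size)

before = gated_mlp(x, W_gate, W_up, W_down)
widened = widen_with_zeros(W_gate, W_up, W_down, new_size)
after = gated_mlp(x, *widened)
max_diff = max(abs(a - b) for a, b in zip(before, after))
print(max_diff)
```

The added units produce a zero activation and are multiplied by zero output weights, so the widened block returns the same vector as the original one.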

To use the expanded capacity and activate the new parameters, **fine-tuning is required**. In its current state, the model will behave much like the previous, smaller version, because the new structural capacity has not yet been trained on any dataset.

## Why This Approach?
Training a Large Language Model from scratch requires immense computational resources and time. By applying Net2Net-style, function-preserving expansion principles:
1. We preserve the billions of tokens' worth of knowledge already embedded in the model.
2. We give the model a much larger "encyclopedic" memory (MLP expansion), which is intended to reduce interference between stored facts and, with it, hallucination.
3. We drastically reduce the time required to achieve a higher parameter count.

## Usage
You can load the model using the `transformers` library, but please keep in mind that it requires further fine-tuning for optimal performance.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SykoSLM/SykoLLM-V6.0-Test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
```
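After loading, it may be worth confirming that the checkpoint really has the expanded shape described above. The helper below is a minimal sketch: `check_architecture` is a hypothetical name, and the demo uses a stand-in object so it runs offline. It assumes the configuration exposes `num_hidden_layers` and `intermediate_size`, as Transformers configs for this model family normally do.

```python
from types import SimpleNamespace

# Values stated in this model card.
EXPECTED = {"num_hidden_layers": 24, "intermediate_size": 3072}


def check_architecture(config, expected=EXPECTED):
    """Return {field: (actual, expected)} for every field that differs."""
    mismatches = {}
    for field, want in expected.items():
        got = getattr(config, field, None)
        if got != want:
            mismatches[field] = (got, want)
    return mismatches


# Stand-in config so the sketch runs without downloading anything.
demo_config = SimpleNamespace(num_hidden_layers=24, intermediate_size=3072)
print(check_architecture(demo_config))  # prints {}
```

With `transformers` installed, passing `model.config` (or the result of `AutoConfig.from_pretrained(model_id)`) instead of `demo_config` should perform the same check against the real checkpoint.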

---

Developed by SykoSLM