---
license: apache-2.0
language:
- tr
- en
base_model: SykoSLM/SykoLLM-V5.9-Mini
pipeline_tag: text-generation
library_name: transformers
tags:
- nlp
- code
- phi3
- depth-up-scaling
- untrained
---
# SykoLLM-V6.0-Test
## Model Overview
**SykoLLM-V6.0-Test** is an up-scaled and structurally expanded version of its predecessor, SykoLLM-V5.9-Mini. Developed by **SykoSLM**, this model is currently in the experimental/testing phase.
The primary objective of this release is to provide a structurally larger foundation model by expanding both the depth (number of layers) and the width (`intermediate_size` / MLP capacity) of the previous architecture, without losing its pre-trained knowledge.
## Architectural Expansion (Up-Scaling)
To overcome the "Knowledge Interference" (capacity bottleneck) observed in previous iterations, two significant architectural changes were applied to this model; a minimal sketch of both operations follows the list:
* **Depth Up-Scaling (DUS):** The number of hidden layers has been increased to **24**. This was achieved by carefully duplicating and mapping the existing layers to preserve the model's logical and syntactic capabilities.
* **Width Expansion (MLP Scaling):** The `intermediate_size` has been expanded to **3072**. To prevent catastrophic forgetting, the newly added weights in the feed-forward networks were initialized to exactly zero (`0.0`). This makes the expansion function-preserving: the new units contribute nothing on the initial forward pass, so each block computes the same output it did before the expansion.
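Below is a minimal, self-contained sketch of both operations on a simplified feed-forward block. Only the target values (24 layers, `intermediate_size = 3072`) come from this card; the hidden size, source layer count, source intermediate size, and layer mapping are illustrative assumptions:

```python
import copy

import torch
import torch.nn as nn

# Illustrative dimensions: only the targets (24 layers, intermediate_size
# 3072) are stated in this card; the source values are hypothetical.
HIDDEN, OLD_FF, NEW_FF = 1024, 2048, 3072

class MLP(nn.Module):
    """A simple up/down feed-forward block (stand-in for the real FFN)."""
    def __init__(self, hidden: int, intermediate: int):
        super().__init__()
        self.up = nn.Linear(hidden, intermediate, bias=False)
        self.down = nn.Linear(intermediate, hidden, bias=False)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(self.act(self.up(x)))

def widen_mlp(old: MLP, new_intermediate: int) -> MLP:
    """Width expansion: copy the old weights, zero-init the new slice.

    The extra rows of `up` output 0, SiLU(0) = 0, and the extra columns
    of `down` are also 0, so the widened block computes exactly the same
    function as the original on the first forward pass.
    """
    new = MLP(old.up.in_features, new_intermediate)
    with torch.no_grad():
        new.up.weight.zero_()
        new.down.weight.zero_()
        old_i = old.up.out_features
        new.up.weight[:old_i, :] = old.up.weight
        new.down.weight[:, :old_i] = old.down.weight
    return new

def upscale_depth(layers: nn.ModuleList, mapping: list[int]) -> nn.ModuleList:
    """Depth up-scaling: build a deeper stack by duplicating existing
    layers according to an index mapping into the original stack."""
    return nn.ModuleList(copy.deepcopy(layers[i]) for i in mapping)

# Sanity check: widening is function-preserving.
torch.manual_seed(0)
old = MLP(HIDDEN, OLD_FF)
x = torch.randn(2, HIDDEN)
assert torch.allclose(old(x), widen_mlp(old, NEW_FF)(x), atol=1e-6)

# Hypothetical mapping from a 16-layer stack to 24 layers by repeating
# the upper-middle block (the actual V6.0 mapping is not documented here):
stack = nn.ModuleList(MLP(HIDDEN, OLD_FF) for _ in range(16))
deeper = upscale_depth(stack, list(range(16)) + list(range(8, 16)))
assert len(deeper) == 24
```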
## ⚠️ Important Notice: Status of the Model
**This model is currently UNTRAINED on the newly added parameters.**
It has been expanded solely to save pre-training time and preserve existing knowledge. While the model retains the capabilities of its predecessor, the newly added parameters (100M+) are currently dormant (zeroed out).
To fully utilize the expanded capacity and activate the new parameters, **fine-tuning is required**. Used in its current state, the model will behave much like the previous, smaller version, because the new structural capacity has not yet been trained on any data.
## Why This Approach?
Training a large language model from scratch requires immense computational resources and time. By applying Net2Net-style (function-preserving expansion) principles:
1. We preserve the billions of tokens' worth of knowledge already embedded in the model.
2. We give the model a much larger "encyclopedic" memory (the MLP expansion) to reduce knowledge overlap and hallucination.
3. We drastically reduce the time required to reach a higher parameter count.
## Usage
You can load the model using the `transformers` library, but please keep in mind that it requires further fine-tuning for optimal performance.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SykoSLM/SykoLLM-V6.0-Test"

# "auto" loads the weights in the dtype stored in the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
```
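A quick generation check, continuing from the snippet above (the prompt and settings are illustrative; because the added parameters are untrained, output quality will mirror the V5.9 base):

```python
# Illustrative prompt; adjust the generation settings to taste.
inputs = tokenizer("Briefly introduce yourself.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```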
---
Developed by SykoSLM