---
language:
- en
license: apache-2.0
tags:
- causal-lm
- text-generation
- pretrained
- tpu
- sykollm
base_model: SykoSLM/SykoLLM-V6.8
---
# SykoLLM-V6.9
**The most powerful model in the SykoLLM family, trained on roughly 8 billion tokens.**
SykoLLM-V6.9 is a 391M-parameter causal language model, trained from scratch on a curated mixture of high-quality English datasets. It is the latest and most capable model in the SykoLLM series, surpassing all previous versions in both token count and training quality.
---
## Model Details
| Property | Value |
|---|---|
| **Architecture** | Causal Language Model (Phi-3 based) |
| **Parameters** | 391,857,152 |
| **Context Length** | 1,024 tokens |
| **Vocabulary Size** | 50,000 |
| **Hidden Size** | 1,024 |
| **Intermediate Size** | 3,072 |
| **Layers** | 24 |
| **Attention Heads** | 8 (GQA: 2 KV heads) |
| **Precision** | bfloat16 |
| **Language** | English only |
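
The parameter count in the table can be cross-checked from the other rows. The sketch below assumes a Phi-3-style decoder layout that the card does not spell out: no bias terms, a gated MLP with three projections, two RMSNorm weight vectors per layer plus a final norm, and an output head that is not tied to the input embeddings. The function name `count_parameters` is illustrative only.

```python
# Values copied from the Model Details table.
VOCAB_SIZE = 50_000
HIDDEN_SIZE = 1_024
INTERMEDIATE_SIZE = 3_072
NUM_LAYERS = 24
NUM_HEADS = 8
NUM_KV_HEADS = 2


def count_parameters(tie_embeddings: bool = False) -> int:
    """Estimate the parameter count under the assumed bias-free layout."""
    head_dim = HIDDEN_SIZE // NUM_HEADS
    kv_dim = NUM_KV_HEADS * head_dim  # GQA: keys/values use fewer heads

    attention = HIDDEN_SIZE * HIDDEN_SIZE      # query projection
    attention += 2 * HIDDEN_SIZE * kv_dim      # key and value projections
    attention += HIDDEN_SIZE * HIDDEN_SIZE     # output projection

    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE  # gate, up, and down projections
    norms = 2 * HIDDEN_SIZE                    # two norm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = VOCAB_SIZE * HIDDEN_SIZE
    # Assumption: the output head is a separate matrix unless tied.
    lm_head = 0 if tie_embeddings else VOCAB_SIZE * HIDDEN_SIZE
    final_norm = HIDDEN_SIZE

    return embeddings + NUM_LAYERS * per_layer + final_norm + lm_head


print(f"{count_parameters():,}")  # 391,857,152
```

Under these assumptions the estimate reproduces the table's 391,857,152 exactly. With tied embeddings the same layout would give 340,657,152, which suggests the head is untied.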
---
## Training Details
| Property | Value |
|---|---|
| **Total Tokens** | ~8 Billion |
| **Training Steps** | 30,000 |
| **Effective Batch Size** | 256 (16 × 2 × 8 cores) |
| **Learning Rate** | 4e-4 (cosine decay) |
| **Optimizer** | Adafactor |
| **Hardware** | Google TPU v5e-8 |
| **Precision** | bfloat16 (XLA native) |
| **Weight Decay** | 0.05 |
| **Warmup Steps** | 200 |
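
The batch size, token total, and learning-rate rows can be sanity-checked with a few lines of arithmetic. This is a sketch under stated assumptions, not the training code: it assumes every sequence is packed to the full 1,024-token context, that the middle factor in `16 × 2 × 8` is gradient accumulation, and that the schedule is a linear warmup followed by cosine decay to zero. None of these details are confirmed by the table.

```python
import math

# Values copied from the Training Details and Model Details tables.
TRAINING_STEPS = 30_000
PER_CORE_BATCH = 16
GRAD_ACCUM_STEPS = 2  # assumption: the "2" in 16 x 2 x 8 is gradient accumulation
TPU_CORES = 8
CONTEXT_LENGTH = 1_024
PEAK_LR = 4e-4
WARMUP_STEPS = 200


def effective_batch_size() -> int:
    """Sequences consumed per optimizer step."""
    return PER_CORE_BATCH * GRAD_ACCUM_STEPS * TPU_CORES


def total_tokens() -> int:
    """Token budget, assuming every sequence fills the whole context."""
    return TRAINING_STEPS * effective_batch_size() * CONTEXT_LENGTH


def learning_rate(step: int) -> float:
    """Assumed shape: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size())  # 256
print(f"{total_tokens():,}")   # 7,864,320,000
```

The product comes to about 7.86 billion tokens, consistent with the "~8 Billion" figure in the table.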
---
## Training Data
SykoLLM-V6.9 was trained on a curated mixture of four datasets, interleaved according to the sampling probabilities below:
| Dataset | Sampling | Description |
|---|---|---|
| [openbmb/Ultra-FineWeb](https://huggingface.co/datasets/openbmb/Ultra-FineWeb) | 25% | High-quality web text, scored and filtered |
| [openbmb/Ultra-FineWeb-L3](https://huggingface.co/datasets/openbmb/Ultra-FineWeb-L3) | 40% | Multi-style synthetic English pretraining data |
| [openbmb/UltraData-Math](https://huggingface.co/datasets/openbmb/UltraData-Math) | 20% | High-quality mathematical reasoning data |
| [openbmb/UltraChat](https://huggingface.co/datasets/openbmb/UltraChat) | 15% | Multi-turn conversational data |
All datasets were filtered with a quality score threshold of ≥ 0.85 and additional heuristic filters to remove low-quality, noisy, or excessively long samples.
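
As an illustration of how such a mixture behaves, the sketch below draws a source for each training example according to the sampling column and applies the stated score threshold. It is a simplified stand-in, not the actual data pipeline: the helper names are hypothetical, and whether the probabilities apply per example or per token is not stated in this card.

```python
import random
from collections import Counter
from typing import Dict, Iterator

# Sampling probabilities copied from the table above.
SAMPLING_WEIGHTS: Dict[str, float] = {
    "openbmb/Ultra-FineWeb": 0.25,
    "openbmb/Ultra-FineWeb-L3": 0.40,
    "openbmb/UltraData-Math": 0.20,
    "openbmb/UltraChat": 0.15,
}
QUALITY_THRESHOLD = 0.85


def passes_quality_filter(score: float) -> bool:
    """Keep a sample only if its quality score meets the threshold."""
    return score >= QUALITY_THRESHOLD


def sample_sources(n: int, seed: int = 0) -> Iterator[str]:
    """Yield n dataset names, each drawn independently by weight."""
    rng = random.Random(seed)
    names = list(SAMPLING_WEIGHTS)
    weights = list(SAMPLING_WEIGHTS.values())
    for _ in range(n):
        yield rng.choices(names, weights=weights, k=1)[0]


# Over many draws the observed shares approach the configured probabilities.
counts = Counter(sample_sources(10_000))
for name, weight in SAMPLING_WEIGHTS.items():
    print(f"{name}: target {weight:.2f}, observed {counts[name] / 10_000:.3f}")
```

The Hugging Face `datasets` library offers a comparable mechanism through `interleave_datasets(..., probabilities=...)`; whether it was used here is not stated.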
---
## Chat Format
SykoLLM-V6.9 uses the following chat template:
```
<|user|>
Your message here<|end|>
<|assistant|>
Model response here<|end|>
```
For multi-turn conversations:
```
<|user|>
Hello, how are you?<|end|>
<|assistant|>
I'm doing great, thank you for asking!<|end|>
<|user|>
Can you help me with a math problem?<|end|>
<|assistant|>
Of course! What's the problem?<|end|>
```
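
The template above can be assembled from a list of messages with a small helper. This is a minimal sketch: the function name `build_prompt`, the `role`/`content` message layout, and the `add_generation_prompt` flag are illustrative choices, not part of the released model code.

```python
from typing import Dict, List


def build_prompt(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the <|user|> / <|assistant|> / <|end|> layout."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        # Each turn is a role tag, a newline, the content, then the end marker.
        parts.append(f"<|{role}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Leave an open assistant tag so the model writes the next reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great, thank you for asking!"},
    {"role": "user", "content": "Can you help me with a math problem?"},
]
print(build_prompt(conversation))
```

With a single user message, the helper produces the same string as the hand-written prompt in the Usage section.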
---
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "SykoSLM/SykoLLM-V6.9"

# Load the tokenizer and the model weights in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Single-turn prompt in the chat format, ending with an open assistant tag.
prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion without tracking gradients.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Special tokens are kept so the <|end|> marker stays visible in the output.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
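
Because the output is decoded with `skip_special_tokens=False`, the printed text contains the prompt and the chat markers as well as the reply. One way to isolate the reply is plain string handling, sketched below. The helper name `extract_reply` is hypothetical, and the sketch assumes the decoded text reproduces the prompt verbatim, which may not hold for every tokenizer.

```python
def extract_reply(decoded: str, prompt: str, end_marker: str = "<|end|>") -> str:
    """Return the first assistant reply that follows the prompt."""
    # Locate the echoed prompt; anything before it (e.g. a start token) is ignored.
    start = decoded.find(prompt)
    text = decoded[start + len(prompt):] if start != -1 else decoded
    # Keep only the text before the first end marker, if one was generated.
    reply, _, _ = text.partition(end_marker)
    return reply.strip()


example_prompt = "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n"
example_decoded = example_prompt + "The capital of France is Paris.<|end|>\n<|user|>\n"
print(extract_reply(example_decoded, example_prompt))
```

An alternative that avoids string matching is to slice the generated token IDs past the prompt length before decoding.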
---
## SykoLLM Family
| Model | Tokens | Notes |
|---|---|---|
| SykoLLM-V6.9 | **~8B** | **Most powerful — current** |
| SykoLLM-V6.8 | <8B | Previous version |
| SykoLLM-V6.6 | <8B | Earlier version |
---
## Limitations
- **English only** — the model was trained exclusively on English data and does not support other languages.
- **Context length** is limited to 1,024 tokens.
- As a base pretrained model, it may produce outputs that are inaccurate, biased, or inappropriate. Use with appropriate safety measures.
- Not instruction-tuned — for best results, use the chat format described above.
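
The context limit constrains the prompt and the generated text together. The sketch below shows one way to keep a request inside the 1,024-token window by capping the number of new tokens. The helper name `clamp_new_tokens` is hypothetical, and the prompt length is assumed to be measured in tokens from the model's own tokenizer.

```python
CONTEXT_LENGTH = 1_024  # from the Model Details table


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Cap the generation length so prompt + output fits in the context."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested, context_length - prompt_tokens)


print(clamp_new_tokens(prompt_tokens=100, requested=256))  # 256
print(clamp_new_tokens(prompt_tokens=900, requested=256))  # 124
```

With `max_new_tokens=256` as in the Usage example, prompts longer than 768 tokens would need either truncation or a smaller generation budget.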
---
## License
This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
---
*Trained with ❤️ by [SykoSLM](https://huggingface.co/SykoSLM)*