Turkish Continued Pretraining of mamba-130m-hf
This repository provides a Turkish continued-pretrained variant of state-spaces/mamba-130m-hf (Transformers-compatible Mamba 130M). The goal is to improve Turkish fluency and local domain robustness while keeping the original architecture and HF usage experience.
Developed by the LinguAI Team, affiliated with KTUN and the YAZGİT community.
What is Mamba?
Mamba is a selective State Space Model (SSM) architecture designed for efficient sequence modeling with linear-time scaling in sequence length. It was introduced by Gu & Dao in “Mamba: Linear-Time Sequence Modeling with Selective State Spaces”.
Training summary (this checkpoint)
- Base model: state-spaces/mamba-130m-hf
- Training type: Continued pretraining (CPT) / domain-adaptation pretraining for Turkish
- Hardware: a single NVIDIA GeForce RTX 4060 Laptop GPU
- Raw text used: ~400 MB of preprocessed Turkish text
- Approximate token count: ~80M–120M tokens (rule of thumb: ~3–5 bytes per token, depending on the tokenizer and text composition; 400 MB typically lands in this band)
Note on the token estimate: without exact tokenizer statistics (the total `input_ids` count) and exact encoding details (UTF-8 composition, whitespace density, punctuation rate), a range is the most honest figure. The exact value can be computed by summing tokenized lengths across the dataset shards.
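As a sketch, that exact count can be obtained by summing tokenized lengths over the corpus; the file path below is an illustrative placeholder, not part of the actual training pipeline:

```python
# Sketch: exact token accounting for a text corpus.
# "turkish_corpus.txt" is a placeholder path; swap in your dataset shards.

def count_tokens(lines, tokenizer):
    """Sum tokenized sequence lengths over an iterable of text lines."""
    return sum(len(tokenizer(text)["input_ids"]) for text in lines)

if __name__ == "__main__":
    from transformers import AutoTokenizer  # assumes transformers is installed

    tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
    with open("turkish_corpus.txt", encoding="utf-8") as f:
        print(f"total tokens: {count_tokens(f, tokenizer):,}")
```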
Intended use
- Turkish text generation (base LM behavior; not instruction-tuned)
- Turkish domain adaptation for downstream fine-tuning (LoRA / full fine-tune)
- Experimentation with SSM-based backbones in `transformers`
Not intended for:
- Safety-critical decisions
- Legal/medical advice
- “Chat assistant” behavior out of the box (this is a base causal LM; you’ll need instruction tuning + safety alignment for assistant-like use)
Quickstart
Install requirements (recommended)
The upstream model card recommends installing `transformers` from `main` (historically required until Mamba support landed in a stable release), plus the optimized CUDA-kernel dependencies for best performance: causal-conv1d and mamba-ssm.
```shell
pip install git+https://github.com/huggingface/transformers@main
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
```
If causal-conv1d and/or mamba-ssm are not installed, Transformers falls back to an "eager" implementation; with them installed, it uses the optimized CUDA kernels when a GPU is available.
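A quick way to see which path you will get is to check whether the two kernel packages are importable at all; this is a sketch that mirrors the availability probe, not the library's own API:

```python
# Sketch: probe for the optimized Mamba kernel packages without importing them.
import importlib.util

def fast_mamba_kernels_available() -> bool:
    """True if both kernel packages are importable in this environment."""
    return (
        importlib.util.find_spec("mamba_ssm") is not None
        and importlib.util.find_spec("causal_conv1d") is not None
    )

print(fast_mamba_kernels_available())  # False means the eager fallback is used
```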
Usage (generation)
Below is the standard transformers generate workflow used by the upstream model card.
```python
import torch
from transformers import AutoTokenizer, MambaForCausalLM

MODEL_ID = "serda-dev/mamba-130m-hf-turkish"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = MambaForCausalLM.from_pretrained(MODEL_ID)

prompt = "Türkiye'de yazılım mühendisi olmak hakkında kısa bir paragraf yaz:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.9,
        top_p=0.95,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```
Tips
For fastest inference on NVIDIA GPUs, ensure the CUDA kernels are enabled by installing mamba-ssm and causal-conv1d. If you run into build issues with these packages, double-check that:
- Your PyTorch CUDA build matches your driver/runtime
- You have a compiler toolchain (e.g., build-essential on Linux)
- You're using a compatible Python version
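The first checklist item can be inspected from Python; a small diagnostic sketch, assuming PyTorch is installed:

```python
# Sketch: print the version info relevant to kernel-build failures.
import torch

print("torch version:   ", torch.__version__)
print("torch CUDA build:", torch.version.cuda)  # None on CPU-only builds
print("CUDA available:  ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:          ", torch.cuda.get_device_name(0))
```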
Fine-tuning (PEFT / LoRA)
The upstream model card includes a PEFT fine-tuning example and, in that example, recommends keeping the model in float32 during fine-tuning.
High-level LoRA recipe:
- Keep LR conservative for CPT-adapted models if your dataset is small
- Target Mamba projection modules as suggested upstream (e.g., `x_proj`, `in_proj`, `out_proj`, embeddings)
- Validate perplexity on a held-out Turkish set
Evaluation (what to check)
For a CPT’d base LM, common quick checks:
- Perplexity on a held-out Turkish slice
- Qualitative prompts: news style, conversational Turkish, formal writing, domain slang
- Degeneration: repetition, short loops, weird token fragments
- Catastrophic forgetting: basic English capability (if that matters for your use case)
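For the perplexity check, the bookkeeping reduces to exponentiating the mean per-token negative log-likelihood; a minimal sketch, with the model-dependent accumulation loop left as illustrative comments:

```python
import math

def perplexity(total_nll: float, total_tokens: int) -> float:
    """exp of the average negative log-likelihood per token."""
    return math.exp(total_nll / total_tokens)

# Accumulation sketch (assumes model/tokenizer from the Quickstart section):
# for batch in held_out_batches:
#     out = model(**batch, labels=batch["input_ids"])
#     n = batch["input_ids"].numel()
#     total_nll += out.loss.item() * n
#     total_tokens += n

print(perplexity(2.0 * 100, 100))  # e^2 ≈ 7.389
```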
Limitations & safety
This is a base language model (not instruction-tuned). It may:
- Hallucinate facts
- Produce biased or unsafe text
- Reflect issues present in training data
Use standard filtering and safety layers for deployments.
Acknowledgements (upstream credit)
This model is a continued-pretrained derivative of state-spaces/mamba-130m-hf. The installation and usage instructions above are based on the upstream Hugging Face model card for Transformers-compatible Mamba.
Mamba architecture reference:
- Albert Gu, Tri Dao. Mamba: Linear-Time Sequence Modeling with Selective State Spaces.
Citation
If you use this model in academic work, please cite the Mamba paper:
```bibtex
@article{gu2023mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}
```
Also consider citing the upstream HF checkpoint:
state-spaces/mamba-130m-hf
Team & Affiliations
This model was developed by the LinguAI Team, an independent research-oriented AI team affiliated with Konya Technical University (KTUN) and operating under the YAZGİT community.
LinguAI Team Members (Core Contributors):
- Ahmet Furkan Kalle
- Alican Tanyeri
- Baris Icoz
- Behlul Anik
- Murat Serda Çelik