Telecom OSS/BSS Domain LLM (Merged Standalone)

Built with Meta Llama 3.

A standalone 8B model created by merging the Tapask/telecom-oss-8b LoRA adapter into its base model, AliMaatouk/LLama-3-8B-Tele. Specialised for TMF Frameworx (eTOM, SID, Open APIs) and OSS/BSS telecom operations. No PEFT runtime required: load and use it like any Llama-3 model.

Two flavours of the same fine-tune:

  • Standalone (this repo) — single load, simpler for inference
  • Adapter-only — 670 MB, needs the base model at load time (smaller download)
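
To run the adapter-only flavour instead, attach it to the base with PEFT. A minimal loading sketch (assuming the adapter repo id Tapask/telecom-oss-8b named above):

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the telecom-pretrained base, then attach the LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "AliMaatouk/LLama-3-8B-Tele", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "Tapask/telecom-oss-8b")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/LLama-3-8B-Tele")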

Model summary

Architecture          Llama-3 8B (transformers-native, fp16 safetensors)
Origin                AliMaatouk/LLama-3-8B-Tele + QLoRA fine-tune (r=64, α=128, dropout=0.05)
Fine-tune targets     q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Training data         18,779 synthetic instruction–response pairs across 8 TMF-aligned categories
Training config       3 epochs · effective batch 16 · seq len 4096 · cosine LR (peak 2e-4) · bf16 · gradient checkpointing
Training hardware     NVIDIA A100 SXM4 80 GB · ~8.3 h wall time
Best eval loss        0.8438 at epoch 2.56 (selected via load_best_model_at_end=True)
Sharded safetensors   5 × 3–4 GB files (16.1 GB total)
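
For reference, the fine-tune hyperparameters above correspond to a PEFT LoraConfig along these lines (a sketch reconstructed from the summary; the actual training script is not published here):

from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                 # rank, as in the summary above
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",          # assumption; not stated in the summary
    task_type="CAUSAL_LM",
)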

Intended use

Domain-specialised completions for:

  • TMF Open API payload generation (TMF620–TMF700 suite)
  • eTOM process decomposition (Fulfillment / Assurance / Billing end-to-end flows)
  • SID entity relationship reasoning (ProductOffering → Service → Resource hierarchies, Party/Role patterns, characteristic specifications)
  • Inventory reconciliation (discovery–inventory mismatches, phantom/orphan resources)
  • OSS/BSS architecture decisions (ODA components, build-vs-buy, MANO choices)
  • Fault-to-inventory correlation (service impact from topology traversal)
  • TMF spec Q&A (technical knowledge retrieval)
  • Integration code (TMF-compliant Python clients)

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tapask/telecom-oss-8b-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
model.eval()

prompt = """Below is an instruction that describes a task related to telecom OSS/BSS systems, TMF Frameworx, or network operations. Write a response that appropriately completes the request.

### Instruction:
Generate a TMF641 service order payload for a 5G network slice with URLLC characteristics targeting an enterprise IoT customer.

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Low temperature suits structured payload generation; raise it for open-ended Q&A.
output = model.generate(
    **inputs, max_new_tokens=1024, temperature=0.3, do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # Llama-3 has no pad token; silences the warning
)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))

The prompt uses the Alpaca template the model was trained with; keep the ### Instruction: / ### Response: markers exactly as shown.
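
Since every request must follow that exact template, a small helper keeps it consistent (build_prompt is a hypothetical convenience, not part of the model assets):

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task related to telecom OSS/BSS "
    "systems, TMF Frameworx, or network operations. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the Alpaca template the model was trained with."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("List the mandatory fields of a TMF622 product order.")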

Deploying with Ollama / llama.cpp

This repo ships fp16 safetensors. For Ollama or llama.cpp, convert to GGUF first:

git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp
pip install -r requirements/requirements-convert_hf_to_gguf.txt
python convert_hf_to_gguf.py /path/to/downloaded/telecom-oss-8b-merged \
    --outfile telecom-oss-8b.f16.gguf --outtype f16
# Build the llama.cpp binaries first if needed (e.g. cmake -B build && cmake --build build);
# depending on the build, llama-quantize may live under build/bin/.
./llama-quantize telecom-oss-8b.f16.gguf telecom-oss-8b.Q4_K_M.gguf Q4_K_M

Then create an Ollama Modelfile that starts FROM ./telecom-oss-8b.Q4_K_M.gguf. Since the model was trained on the Alpaca format above rather than the stock Llama-3 chat template, the Modelfile template should reproduce those markers.
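
A minimal Modelfile sketch (the stop token and parameter values are suggestions, not tested defaults):

FROM ./telecom-oss-8b.Q4_K_M.gguf
TEMPLATE """Below is an instruction that describes a task related to telecom OSS/BSS systems, TMF Frameworx, or network operations. Write a response that appropriately completes the request.

### Instruction:
{{ .Prompt }}

### Response:
"""
PARAMETER temperature 0.3
PARAMETER stop "### Instruction:"

Then ollama create telecom-oss -f Modelfile and ollama run telecom-oss to use it.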

Training data

18,779 instruction–response pairs generated programmatically via the Claude API plus Kimi K2.5 and GLM-5 on Ollama Cloud, prompted with 8 category-specific TMF expert personas (system prompts plus 4–5 batch variants each). Distribution:

#  Category                         Pairs   Primary model
1  TMF Open API Payloads            2,962   GLM-5
2  eTOM Process Decomposition       1,967   GLM-5
3  SID Entity Reasoning             1,963   Kimi K2.5
4  Inventory Reconciliation         2,962   Kimi K2.5
5  OSS/BSS Architecture             1,893   Kimi K2.5
6  Fault-to-Inventory Correlation   1,929   GLM-5
7  TMF Spec Q&A                     2,875   Kimi K2.5 (after GLM-5 hit a 54% dedup rate)
8  TMF Integration Code Generation  2,228   GLM-5

Splits (seed 42): 16,901 train / 939 val / 939 test.
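
A plausible reconstruction of that split with Hugging Face datasets (hypothetical; the actual split script is not part of this repo, and pairs.jsonl is a placeholder filename):

from datasets import load_dataset

ds = load_dataset("json", data_files="pairs.jsonl")["train"]   # 18,779 pairs
split = ds.train_test_split(test_size=1878, seed=42)           # hold out 939 + 939
holdout = split["test"].train_test_split(test_size=0.5, seed=42)
train, val, test = split["train"], holdout["train"], holdout["test"]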

Quality passes applied:

  • MD5-hash deduplication on instruction field
  • Category-aware soft validators (TMF API reference presence, SID entity coverage, eTOM term coverage, JSON validity for payload categories)
  • Refusal-pattern scrubbing ("I cannot", "As an AI", etc. removed)
  • Type coercion for 297 pairs where the source models emitted the output field as nested JSON objects instead of JSON strings
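
For illustration, the dedup pass might look like this (a sketch, not the actual pipeline code):

import hashlib

def dedup_on_instruction(pairs):
    """Keep the first pair seen for each distinct instruction (MD5 of the text)."""
    seen, kept = set(), []
    for pair in pairs:
        digest = hashlib.md5(pair["instruction"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(pair)
    return kept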

Evaluation loss trajectory

Epoch   Eval loss
2.27    0.8545
2.37    0.8440
2.46    0.8447
2.56    0.8438   ← best, used for merge
2.65    0.8479
2.75    0.8478

Loss plateaued and began ticking up after epoch 2.56, a classic sign of mild overfitting. load_best_model_at_end=True ensured the merged model corresponds to the epoch-2.56 checkpoint.
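
That checkpoint-selection behaviour corresponds to Trainer settings along these lines (a sketch; only the arguments relevant here are shown, the rest are placeholders):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                  # placeholder
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    bf16=True,
    gradient_checkpointing=True,
    eval_strategy="steps",
    save_strategy="steps",
    load_best_model_at_end=True,       # restores the epoch-2.56 checkpoint here
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)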

Limitations

  • Synthetic data provenance — training pairs were generated by LLMs (Claude, Kimi K2.5, GLM-5) prompted with TMF expert personas. Content is stylistically consistent with TMF specs but not validated line-by-line against official TMF Open API documents. Treat outputs as starting points, not canonical.
  • Inner-JSON flaws — ~15% of category-1 pairs had minor inner-JSON issues (unescaped quotes inside payload strings); these were not filtered out before training.
  • Category 8 undertrained — the TMF Integration Code Generation category stopped at 74% of its 3,000-pair target due to a narrow topic space and dedup losses, so code generation is the model's weakest axis.
  • Domain scope — the model is narrow. General-purpose conversation, math, or code outside TMF integration will be no better (and often worse) than the base.
  • Standards currency — trained against TMF Open API versions current as of the prompt design (~v4/v5 dominant). May cite outdated endpoint paths for newer TMF releases.

Use restrictions

Follows the Llama 3 Community License and Acceptable Use Policy. Intended for:

  • Domain research, prototyping, and educational use
  • Assistant-style answers to TMF/OSS/BSS engineering questions
  • Starter payload generation (to be reviewed before use in production)

Not suitable for:

  • Generating production system configuration without human review
  • Compliance-sensitive deployments (TMF spec accuracy is not guaranteed)
  • Any of the prohibited uses in the Llama 3 AUP

License

  • Model weights: inherit Llama 3 Community License from the base model meta-llama/Meta-Llama-3-8B
  • "Built with Meta Llama 3" attribution required (see top of this card)
  • The Llama 3 license restricts some commercial uses (the 700M+ MAU clause) and lists prohibited use cases; consult the license before redistribution

Acknowledgements

  • Meta AI — Llama 3 base model
  • Ali Maatouk — telecom-pretrained continuation AliMaatouk/LLama-3-8B-Tele
  • Anthropic, Moonshot AI, Zhipu AI — Claude, Kimi K2.5, GLM-5 (used to generate synthetic training data)
  • TMForum — the eTOM, SID, and Open API standards this model targets

Citation

@misc{tapask_telecom_oss_8b_merged_2026,
  title        = {Telecom OSS/BSS Domain LLM (Merged, based on LLama-3-8B-Tele)},
  author       = {Tapas},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Tapask/telecom-oss-8b-merged}},
}