---
base_model: AliMaatouk/LLama-3-8B-Tele
library_name: peft
license: llama3
language:
- en
tags:
- telecom
- oss
- bss
- tmf
- tmforum
- etom
- sid
- lora
- peft
- llama-3
pipeline_tag: text-generation
---
# Telecom OSS/BSS Domain LLM (LoRA Adapter)
**Built with Meta Llama 3.**
A LoRA fine-tune of [`AliMaatouk/LLama-3-8B-Tele`](https://huggingface.co/AliMaatouk/LLama-3-8B-Tele) specialised for **TMF Frameworx** (eTOM, SID, Open APIs) and OSS/BSS telecom operations.
## Model summary
| | |
|---|---|
| **Base model** | `AliMaatouk/LLama-3-8B-Tele` (Llama-3-8B pretrained on telecom corpora) |
| **Adapter type** | QLoRA (4-bit NF4 quantized, r=64, α=128, dropout=0.05) |
| **Target modules** | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| **Trainable params** | 167M / 8.2B (2.05%) |
| **Training data** | 18,779 synthetic instruction–response pairs across 8 TMF-aligned categories |
| **Training config** | 3 epochs · effective batch 16 · seq 4096 · cosine LR (peak 2e-4) · bf16 · gradient checkpointing |
| **Hardware** | NVIDIA A100 SXM4 80GB · ~8.3 hours wall time |
| **Best eval loss** | **0.8438** (epoch 2.56) — `load_best_model_at_end=True` |
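
The trainable-parameter figure can be sanity-checked from the adapter settings. The sketch below assumes the base model keeps the standard Llama-3-8B dimensions (hidden size 4096, MLP size 14336, 32 decoder layers, 8 KV heads of dimension 128) and the stock Llama-3-8B parameter count of 8,030,261,248; neither is stated in this card. Each LoRA-wrapped linear layer adds `r × (in_features + out_features)` parameters.

```python
# LoRA adds two matrices per wrapped linear layer: A (r x in) and B (out x r).
R = 64
HIDDEN, MLP, KV_DIM, LAYERS = 4096, 14336, 8 * 128, 32  # assumed Llama-3-8B dims
BASE_PARAMS = 8_030_261_248  # assumed stock Llama-3-8B parameter count

# (in_features, out_features) for each target module in one decoder layer
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

lora_params = LAYERS * sum(R * (i + o) for i, o in shapes.values())
share = lora_params / (BASE_PARAMS + lora_params)
print(f"{lora_params:,} trainable ({share:.2%} of all parameters)")
```

Under those assumptions the count comes to 167,772,160 parameters, about 2.05% of the combined total, which is consistent with the table.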
## Intended use
Domain-specialised completions and code generation for:
- **TMF Open API** payload generation (TMF620–TMF700 suite)
- **eTOM** process decomposition (Fulfillment / Assurance / Billing end-to-end flows)
- **SID** entity relationship reasoning (ProductOffering → Service → Resource hierarchies, Party/Role patterns, characteristic specifications)
- **Inventory reconciliation** (discovery–inventory mismatches, phantom/orphan resources)
- **OSS/BSS architecture** decisions (ODA components, build-vs-buy, MANO choices)
- **Fault-to-inventory correlation** (service impact from topology traversal)
- **TMF spec Q&A** (technical knowledge retrieval)
- **Integration code** (TMF-compliant Python clients)
### How to use
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "AliMaatouk/LLama-3-8B-Tele"
adapter = "Tapask/telecom-oss-8b"

# Load the telecom-pretrained base model, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Alpaca-style prompt; keep the section markers exactly as written.
prompt = """Below is an instruction that describes a task related to telecom OSS/BSS systems, TMF Frameworx, or network operations. Write a response that appropriately completes the request.
### Instruction:
Generate a TMF641 service order payload for a 5G network slice with URLLC characteristics targeting an enterprise IoT customer.
### Response:
"""

# Tokenize, sample a completion, and print only the newly generated tokens.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024, temperature=0.3, do_sample=True)
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
This example uses the **Alpaca prompt template** that the model was trained with. Keep the `### Instruction:` and `### Response:` markers exactly as shown.
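
To avoid retyping the template, the prompt can be built by a small helper. This is a minimal sketch: `build_prompt` and `ALPACA_HEADER` are hypothetical names, and the line breaks follow the template as printed in this card. If the training template used blank lines between sections, adjust the separators to match.

```python
# Header text copied from the example above.
ALPACA_HEADER = (
    "Below is an instruction that describes a task related to telecom OSS/BSS "
    "systems, TMF Frameworx, or network operations. Write a response that "
    "appropriately completes the request."
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style template (hypothetical helper)."""
    return f"{ALPACA_HEADER}\n### Instruction:\n{instruction.strip()}\n### Response:\n"


prompt = build_prompt(
    "Generate a TMF641 service order payload for a 5G network slice with "
    "URLLC characteristics targeting an enterprise IoT customer."
)
```

The resulting `prompt` can be passed to the tokenizer in place of the hand-written string.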
## Training data
18,779 instruction–response pairs were generated programmatically via the [Claude API](https://www.anthropic.com/), [Kimi K2.5 on Ollama Cloud](https://ollama.com/), and [GLM-5 on Ollama Cloud](https://ollama.com/), prompted with 8 category-specific TMF expert personas (system prompts + 4–5 batch variants each). Distribution:
| # | Category | Pairs | Primary model |
|---|---|---:|---|
| 1 | TMF Open API Payloads | 2,962 | GLM-5 |
| 2 | eTOM Process Decomposition | 1,967 | GLM-5 |
| 3 | SID Entity Reasoning | 1,963 | Kimi K2.5 |
| 4 | Inventory Reconciliation | 2,962 | Kimi K2.5 |
| 5 | OSS/BSS Architecture | 1,893 | Kimi K2.5 |
| 6 | Fault-to-Inventory Correlation | 1,929 | GLM-5 |
| 7 | TMF Spec Q&A | 2,875 | Kimi K2.5 (after GLM-5 hit 54% dedup rate) |
| 8 | TMF Integration Code Generation | 2,228 | GLM-5 |
Splits (seed 42): **16,901 train / 939 val / 939 test.**
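
The split sizes are consistent with a seeded 90/5/5 shuffle. The card does not describe the actual split procedure, so the sketch below is only one way to arrive at the same counts; `split_indices` and the 5% holdout fraction are assumptions inferred from the numbers above.

```python
import random


def split_indices(n: int, seed: int = 42, holdout_frac: float = 0.05):
    """Shuffle indices deterministically and cut train/val/test (illustrative only)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_hold = round(n * holdout_frac)  # 939 for n = 18,779
    val, test = idx[:n_hold], idx[n_hold:2 * n_hold]
    train = idx[2 * n_hold:]
    return train, val, test


train, val, test = split_indices(18_779)
print(len(train), len(val), len(test))
```

With 18,779 pairs this yields 16,901 / 939 / 939, matching the reported sizes, although the membership of each split depends on the shuffling method actually used.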
Quality passes applied:
- MD5-hash deduplication on `instruction` field
- Category-aware soft validators (TMF API reference presence, SID entity coverage, eTOM term coverage, JSON validity for payload categories)
- Refusal-pattern scrubbing (`I cannot`, `As an AI`, etc. removed)
- Type coercion for 297 pairs where source models emitted `output` as nested JSON objects instead of JSON strings
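
Three of the passes above (deduplication, type coercion, refusal scrubbing) can be sketched in a few lines. This is not the author's pipeline: `clean_pairs` is a hypothetical helper, stripping whitespace before hashing is an assumption, and dropping whole pairs that contain a refusal marker is one possible reading of "removed".

```python
import hashlib
import json

REFUSAL_MARKERS = ("I cannot", "As an AI")  # the two patterns named above


def clean_pairs(pairs):
    """Apply dedup, type coercion, and a refusal filter to raw pairs (a sketch)."""
    seen, kept = set(), []
    for pair in pairs:
        # MD5 over the instruction text; stripping whitespace first is an assumption.
        key = hashlib.md5(pair["instruction"].strip().encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)

        output = pair["output"]
        # Coerce nested JSON objects to JSON strings so every output is a str.
        if not isinstance(output, str):
            output = json.dumps(output, ensure_ascii=False)

        # Assumption: pairs containing a refusal marker are dropped entirely.
        if any(marker in output for marker in REFUSAL_MARKERS):
            continue
        kept.append({**pair, "output": output})
    return kept


raw = [
    {"instruction": "Explain TMF641.", "output": {"answer": "Service ordering API"}},
    {"instruction": "Explain TMF641. ", "output": "duplicate after stripping"},
    {"instruction": "Explain eTOM.", "output": "As an AI, I cannot help."},
]
cleaned = clean_pairs(raw)
```

In this toy input only the first pair survives, with its nested `output` serialised to a JSON string.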
## Evaluation loss trajectory
| Epoch | Eval loss |
|---|---|
| 2.27 | 0.8545 |
| 2.37 | 0.8440 |
| 2.46 | 0.8447 |
| **2.56** | **0.8438** ← best, used for final adapter |
| 2.65 | 0.8479 |
| 2.75 | 0.8478 |
Eval loss plateaued from around epoch 2.37 and began ticking up after epoch 2.56, a mild overfitting signal. Because training used `load_best_model_at_end=True`, the final adapter corresponds to the best checkpoint, at around epoch 2.56.
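
When the tracked metric is eval loss, best-checkpoint selection amounts to taking the minimum over the logged evaluations. A minimal sketch using the rows from the table above (the variable names are illustrative):

```python
# (epoch, eval loss) rows copied from the table above
history = [
    (2.27, 0.8545), (2.37, 0.8440), (2.46, 0.8447),
    (2.56, 0.8438), (2.65, 0.8479), (2.75, 0.8478),
]

# Lowest eval loss wins
best_epoch, best_loss = min(history, key=lambda row: row[1])

# Gap between the last logged evaluation and the best one
drift = history[-1][1] - best_loss
print(best_epoch, best_loss, round(drift, 4))
```

The last logged evaluation sits about 0.004 above the best value, which is why the upturn is described as mild.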
## Limitations
- **Synthetic data provenance** — training pairs were generated by LLMs (Claude, Kimi K2.5, GLM-5) prompted with TMF expert personas. Content is stylistically consistent with TMF specs but **not validated line-by-line against official TMF Open API documents**. Treat outputs as starting points, not canonical.
- **Inner-JSON flaws** — ~15% of category-1 pairs had minor inner-JSON issues (unescaped quotes inside payload strings). Not filtered out for training.
- **Category 8 undertrained** — TMF Code Generation category ended at 74% of its 3,000-pair target due to narrow topic space and dedup loss. Code-generation quality is the weakest axis.
- **Domain scope** — the model is narrow. General-purpose conversation, math, code outside TMF integration, etc. will be no better (and often worse) than the base.
- **Standards currency** — trained against TMF Open API versions current as of the prompt design (~v4/v5 dominant). May cite outdated endpoint paths for newer TMF releases.
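
Given the inner-JSON caveat above, it is worth checking that a generated payload parses before passing it downstream. The sketch below uses only the standard library; `extract_json` is a hypothetical helper, the sample strings are made up for illustration, and a successful parse says nothing about conformance to a TMF schema.

```python
import json


def extract_json(text: str):
    """Return the first parseable JSON object in `text`, or None (hypothetical helper)."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    return None


# Well-formed object embedded in surrounding prose
good = extract_json('Here is the order: {"id": "42", "state": "acknowledged"}')

# Unescaped quotes inside a string value, the flaw described above
bad = extract_json('{"description": "5G "URLLC" slice"}')
```

Here `good` is a dict and `bad` is `None`, so a caller can retry generation or flag the output for review.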
## Intended use — restrictions
Follows the [Llama 3 Community License](https://llama.meta.com/llama3/license/) and [Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/). Additionally, this adapter is intended for:
- Domain research, prototyping, and educational use
- Assistant-style answers to TMF/OSS/BSS engineering questions
- Starter payload generation (to be reviewed before use in production)
Not suitable for:
- Generating production system configuration without human review
- Compliance-sensitive deployments (TMF spec accuracy is not guaranteed)
- Any of the prohibited uses in the Llama 3 AUP
## License
- Adapter weights: inherit Llama 3 Community License from the base model
- Attribution required: see top of this card
- Note that the Llama 3 license restricts some commercial uses (the 700M+ MAU clause) and prohibits certain use cases; consult the license before redistribution
## Acknowledgements
- **Meta AI** — Llama 3 base model
- **Ali Maatouk** — telecom-pretrained continuation [`AliMaatouk/LLama-3-8B-Tele`](https://huggingface.co/AliMaatouk/LLama-3-8B-Tele)
- **Anthropic, Moonshot AI, Zhipu AI** — Claude, Kimi K2.5, GLM-5 (used to generate synthetic training data)
- **TMForum** — the eTOM, SID, and Open API standards this model targets
## Citation
```bibtex
@misc{tapask_telecom_oss_8b_2026,
title = {Telecom OSS/BSS Domain LLM (LoRA Adapter for LLama-3-8B-Tele)},
author = {Tapas},
year = {2026},
howpublished = {\url{https://huggingface.co/Tapask/telecom-oss-8b}},
}
```