telecom-oss-8b-fp8

FP8 weight-quantized version of Tapask/telecom-oss-8b-merged.

Attribution

This model is a quantized derivative of Tapask/telecom-oss-8b-merged, created by Tapas. All credit for the original model, training data, and domain specialisation belongs to the original author.

Original model card: https://huggingface.co/Tapask/telecom-oss-8b-merged

What changed

  • Weights quantized from FP16 → FP8 (E4M3) using optimum-quanto
  • lm_head kept in FP16 for accuracy
  • Disk size reduced from ~16 GB → ~8 GB
  • No fine-tuning or data changes; only quantization was applied

Intended use

Same as the original model, i.e. domain-specialised completions for:

  • TMF Open API payload generation (TMF620–TMF700)
  • eTOM process decomposition
  • SID entity relationship reasoning
  • Inventory reconciliation
  • OSS/BSS architecture decisions
  • Fault-to-inventory correlation
  • TMF spec Q&A

How to use

Uses the same Alpaca prompt template as the original:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ukkathva/telecom-oss-8b-fp8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = """Below is an instruction that describes a task related to telecom OSS/BSS systems, TMF Frameworx, or network operations. Write a response that appropriately completes the request.

### Instruction:
{your instruction here}

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, temperature=0.3, do_sample=True)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
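For repeated use, the template can be wrapped in a small helper; `build_prompt` and `ALPACA_TEMPLATE` are hypothetical names for this sketch, not part of the model's tooling:

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task related to telecom OSS/BSS "
    "systems, TMF Frameworx, or network operations. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    # Fill the Alpaca-style template the model was fine-tuned on.
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("List the mandatory fields of a TMF622 ProductOrder create request.")
```

The resulting string can be passed to the tokenizer exactly as in the example above.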

Hardware

  • Conversion: A100 80GB (Google Colab Pro+)
  • Inference: A100 40GB or 80GB recommended (~10 GB VRAM used)
  • Note: FP8 compute acceleration requires H100/H200. On A100, weights dequantize to FP16 at runtime (W8A16). You still get the ~50% VRAM saving.

Limitations

Inherits all limitations of the original model; see Tapask/telecom-oss-8b-merged for full details.

License

Inherits Llama 3 Community License from the base model.
"Built with Meta Llama 3"; see Meta's license for commercial use terms.
