telecom-oss-8b-fp8

FP8 weight-quantized version of Tapask/telecom-oss-8b-merged.

Attribution

This model is a quantized derivative of Tapask/telecom-oss-8b-merged, created by Tapas. All credit for the original model, training data, and domain specialisation belongs to the original author.

Original model card: https://huggingface.co/Tapask/telecom-oss-8b-merged

What changed

  • Weights quantized from FP16 → FP8 (E4M3) using optimum-quanto
  • lm_head kept in FP16 for accuracy
  • Disk size reduced from ~16 GB → ~8 GB
  • No fine-tuning or data changes; only quantization was applied

Intended use

Same as the original model, i.e. domain-specialised completions for:

  • TMF Open API payload generation (TMF620–TMF700)
  • eTOM process decomposition
  • SID entity relationship reasoning
  • Inventory reconciliation
  • OSS/BSS architecture decisions
  • Fault-to-inventory correlation
  • TMF spec Q&A

How to use

Uses the same Alpaca prompt template as the original:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ukkathva/telecom-oss-8b-fp8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = """Below is an instruction that describes a task related to telecom OSS/BSS systems, TMF Frameworx, or network operations. Write a response that appropriately completes the request.

### Instruction:
{your instruction here}

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, temperature=0.3, do_sample=True)
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
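For repeated use, the template can be wrapped in a small helper; `build_prompt` and `ALPACA_TEMPLATE` are hypothetical names for this sketch, not part of the model's tooling:

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task related to telecom OSS/BSS "
    "systems, TMF Frameworx, or network operations. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    # Fill the Alpaca-style template the model was fine-tuned on.
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("List the mandatory fields of a TMF622 ProductOrder create request.")
```

The resulting string can be passed to the tokenizer exactly as in the example above.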

Hardware

  • Conversion: A100 80GB (Google Colab Pro+)
  • Inference: A100 40GB or 80GB recommended (~10 GB VRAM used)
  • Note: FP8 compute acceleration requires H100/H200. On A100, weights dequantize to FP16 at runtime (W8A16). You still get the ~50% VRAM saving.

Limitations

Inherits all limitations of the original model; see Tapask/telecom-oss-8b-merged for full details.

License

Inherits Llama 3 Community License from the base model.
"Built with Meta Llama 3"; see Meta's license for commercial use terms.
