ACE-Primus-Nemotron-70B-Instruct-nvfp4

Model Description

ACE-Primus-Nemotron-70B-Instruct-nvfp4 is a production-ready, enterprise-grade large language model developed through a collaborative effort between APMIC and Trend Micro's cybersecurity AI research team.

This model is based on Llama-Primus-Nemotron-70B-Instruct, which itself builds upon NVIDIA's Llama-3.1-Nemotron-70B-Instruct foundation and has been enhanced with cybersecurity-specific training and evaluation improvements. APMIC has contributed by performing NVIDIA NVFP4 precision quantization and hardware-aware engineering to support highly efficient inference across next-generation GPU platforms.

This release showcases APMIC's capability to integrate third-party domain-specialized models into optimized deployment workflows tailored for modern enterprise AI environments.

Model Details

  • Developed by: APMIC
  • Funded by: APMIC, under the leadership of CEO Jerry Wu
  • Shared with / in collaboration with: Trend Micro / Trend-Cybertron team
  • Model type: Llama3ForConditionalGeneration (Transformers)
  • Language(s) (NLP): English (primary) with domain-specific cybersecurity terminology
  • License: LLAMA 3.1 Community License Agreement (via Trend-Cybertron initiative)
  • Finetuned from model: trend-cybertron/Llama-Primus-Nemotron-70B-Instruct

Base Model Background

The base model Llama-Primus-Nemotron-70B-Instruct was derived by applying continued pretraining and supervised finetuning on top of Llama-3.1-Nemotron-70B-Instruct, using extensive open cybersecurity text corpora and expert-curated instruction datasets. This process improves performance on cybersecurity-oriented tasks, as demonstrated by aggregate benchmark gains over the original NVIDIA Nemotron baselines.

APMIC Optimization

NVFP4 Precision Quantization

APMIC's contribution focuses on quantizing the model to NVIDIA NVFP4 precision and performing hardware-specific performance optimization.
NVFP4 quantization enables:

  • Substantial reductions in memory footprint and bandwidth requirements
  • Improved inference throughput on NVIDIA GPU architectures
  • Maintained quality across cybersecurity reasoning and instruction-following tasks
  • Production suitability for enterprise AI deployment

This work reflects APMIC's expertise in precision-aware model optimization and deployment engineering tuned to infrastructure requirements.

Key Capabilities

Domain-Specialized Cybersecurity Reasoning

The original Primus-Nemotron model demonstrates enhanced performance in cybersecurity benchmarks compared to the NVIDIA baseline across multiple evaluation categories, highlighting:

  • Knowledge of cyber threat intelligence concepts
  • Structured reasoning over vulnerability and threat indicators
  • Improved relevance in domain-specific query responses

Enterprise-Ready Performance

With NVFP4 quantization and hardware optimization, this model is suitable for:

  • High-throughput inference in enterprise cybersecurity platforms
  • Integration into SOC (Security Operations Center) workflows
  • Scalable deployment on NVIDIA GPU infrastructure
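As a deployment sketch, a quantized checkpoint like this one could be exposed as an OpenAI-compatible endpoint via vLLM. This is an illustrative configuration fragment, not an officially documented recipe: it assumes a vLLM build that recognizes NVFP4 checkpoints and a GPU architecture with FP4 support, and the tensor-parallel and context-length settings are placeholder values:

```shell
# Hypothetical serving command (assumes vLLM with NVFP4 checkpoint support
# and FP4-capable GPUs); flags shown are illustrative, not prescribed.
vllm serve APMIC/Llama-Primus-Nemotron-70B-Instruct-nvfp4 \
  --tensor-parallel-size 2 \
  --max-model-len 8192
```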

Collaboration with Trend Micro

This release is the result of a strategic collaboration between APMIC and Trend Micro's AI research team.
APMIC's optimization complements Trend Micro's domain-trained model, enabling joint delivery of:

  • Domain-refined cybersecurity language intelligence
  • Infrastructure-ready AI assets compatible with modern GPU deployments

This partnership illustrates a combined approach where domain specialization meets deployment optimization.
