# ACE-gemma-3-12b-it-fp8

## Model Description
ACE-gemma-3-12b-it-fp8 is an enterprise-grade, production-ready large language model developed and optimized by APMIC.
This model is based on google/gemma-3-12b-pt and has undergone a development pipeline of continued pretraining, supervised fine-tuning, and FP8 precision optimization, producing a highly capable model tailored to Traditional Chinese and bilingual (Traditional Chinese & English) use cases.
The design and release of this model demonstrate APMIC’s end-to-end capability in foundational model refinement, localization, and hardware-aware performance optimization.
## Model Details
- Developed by: Min Yi Chen, Liang Hsun Huang, Wen Bin Lin & Dave Sung (all authors contributed equally to this work)
- Funded by: APMIC, under the leadership of CEO Jerry Wu
- Model type: Gemma3ForConditionalGeneration (Transformers)
- Language(s) (NLP): Traditional Chinese & English
- License: gemma (Google usage license; gated on Hugging Face)
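FP8 checkpoints are typically served through an FP8-aware inference engine. A minimal deployment sketch using vLLM is shown below; the flags are standard vLLM options chosen for illustration, not settings prescribed by the model authors, and the model is gated, so a Hugging Face token with accepted Gemma terms is assumed.

```shell
# Hypothetical deployment sketch: serve the FP8 checkpoint with vLLM.
# Assumes a GPU with native FP8 support (e.g. Hopper/Ada) and that the
# checkpoint's quantization format is auto-detected from its config.
pip install vllm

vllm serve APMIC/ACE-gemma-3-12b-it-fp8 \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90
```

The server exposes an OpenAI-compatible API on port 8000 by default, so existing client code can point at it with only a base-URL change.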
## Development Pipeline

### Continued Pretraining (CPT)
The base model google/gemma-3-12b-pt was further pretrained on domain-relevant corpora to improve its native Traditional Chinese language understanding.
This step enhanced contextual fluency, vocabulary calibration, and semantic alignment to language patterns common in Taiwan and other Traditional Chinese contexts.
### Supervised Fine-Tuning (SFT)
Following continued pretraining, the model underwent supervised fine-tuning using task-oriented instruction datasets.
This process improved:
- Response relevance and instruction adherence
- Task specificity across diverse scenarios
- Consistency in generation quality
- Safety and reliability in structured output
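Instruction-tuned Gemma checkpoints expect prompts in the Gemma turn-based chat format. The sketch below builds such a prompt by hand to make the structure visible; in practice you would call the tokenizer's `apply_chat_template()`, which applies the exact template shipped with the model. The markers follow the published Gemma convention and are an illustration, not the authors' specification.

```python
# Minimal sketch of the Gemma turn-based prompt format.
# Prefer tokenizer.apply_chat_template() in real code; these markers
# follow the public Gemma convention and are shown for illustration.

def build_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma-style turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("請用繁體中文介紹台北 101。")
print(prompt)
```

The trailing `<start_of_turn>model\n` cues the model to begin its reply, which is why generation is stopped on the next `<end_of_turn>` token.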
### FP8 Precision Optimization
The final stage of the pipeline involved quantization to FP8 precision, reducing memory usage and increasing inference throughput while maintaining linguistic quality and instruction performance.
This demonstrates APMIC’s expertise in precision optimization for efficient deployment without significant loss of accuracy.
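The headline memory saving is easy to estimate: halving the bytes per weight roughly halves the weight footprint. A back-of-the-envelope calculation for a 12B-parameter model follows; it is illustrative only, since real deployments also spend memory on activations, the KV cache, and any layers kept in higher precision.

```python
# Back-of-the-envelope weight-memory estimate for a 12B-parameter model.
# Treat these as rough lower bounds: activations, KV cache, and layers
# left in higher precision (e.g. embeddings, norms) add to real usage.
PARAMS = 12_000_000_000

def weight_gb(bytes_per_param: float) -> float:
    """Weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bytes_per_param / 1e9

bf16_gb = weight_gb(2.0)  # BF16: 2 bytes per parameter
fp8_gb = weight_gb(1.0)   # FP8:  1 byte per parameter

print(f"BF16 weights: {bf16_gb:.0f} GB")  # 24 GB
print(f"FP8 weights:  {fp8_gb:.0f} GB")   # 12 GB
```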
## Key Capabilities

### Balanced Bilingual Understanding and Generation
The resulting model exhibits strong performance in both Traditional Chinese and English, enabling:
- Cross-lingual understanding
- Bilingual text generation
- Translation-adjacent reasoning and summarization
- Enterprise workflows with mixed language content
### Culturally Aligned Output
The model’s continued pretraining and fine-tuning datasets emphasize regional vocabulary, references, and styles relevant to Taiwan, enabling:
- Domain-specific comprehension
- Localized conversational tone
- Regulatory and cultural sensitivity in outputs
## Hardware Optimization

### FP8 Runtime Efficiency
By applying FP8 quantization in conjunction with deployment-aware engineering, APMIC/ACE-gemma-3-12b-it-fp8 achieves:
- Enhanced throughput on supported GPU architectures
- Lower memory and compute utilization per token
- Cost-effective inference suitable for enterprise production environments
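The accuracy/efficiency trade-off behind these bullets can be made concrete with a toy per-tensor scaling scheme of the kind commonly used for FP8 (E4M3) weight quantization. The helper below is a simplified illustration, not APMIC's actual quantization code: E4M3 has a 3-bit mantissa and a maximum finite value of 448, so a common recipe scales each tensor so its largest magnitude lands near 448, rounds to 4 significant binary digits, and rescales on dequantization.

```python
import math

# Toy per-tensor FP8 (E4M3) fake-quantization. This ignores subnormals
# and exponent-range clipping; it is a sketch of the idea, not a
# faithful E4M3 codec.
E4M3_MAX = 448.0  # largest finite E4M3 value

def round_to_e4m3_grid(x: float) -> float:
    """Round x to 4 significant binary digits (1 implicit + 3 mantissa bits)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)      # x = m * 2**e, with 0.5 <= |m| < 1
    m = round(m * 16) / 16    # keep 4 significant bits of the mantissa
    return math.ldexp(m, e)

def fake_quantize(values, amax=None):
    """Scale to the E4M3 range, round to the E4M3 grid, rescale back."""
    amax = amax or max(abs(v) for v in values)
    scale = amax / E4M3_MAX
    return [round_to_e4m3_grid(v / scale) * scale for v in values]

out = fake_quantize([1.0, 3.1, -0.004])
print(out)  # each value is recovered to within ~1 part in 16
```

The coarse mantissa is why FP8 checkpoints are typically produced with calibration rather than naive rounding: the per-tensor (or per-channel) scale determines how much of the 448-wide range each tensor actually uses.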
## Positioning

This model demonstrates APMIC's ability to advance open foundation models through a complete refinement pipeline: continued pretraining, supervised fine-tuning, and precision-aware optimization.
It is intended for organizations that require:
- High-quality Traditional Chinese and bilingual language intelligence
- Efficient inference on modern GPU platforms
- Enterprise-ready deployment within production systems