---
license: mit
tags:
- amop-optimized
- onnx
---

# AMOP-Optimized CPU Model: {repo_name}

This model was automatically optimized for CPU inference using the **Adaptive Model Optimization Pipeline (AMOP)**.

- **Base Model:** [{model_id}](https://huggingface.co/{model_id})
- **Optimization Date:** {optimization_date}

## Optimization Details

The following AMOP stages were applied:

- **Stage 2: Pruning:** {pruning_status} (Percentage: {pruning_percent}%)
- **Stage 3 & 4: Quantization & ONNX Conversion:** Enabled (Dynamic Quantization)
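As an illustrative sketch only (this is not AMOP's actual implementation), magnitude pruning at a given percentage and symmetric dynamic int8 quantization can be expressed in plain Python:

```python
def magnitude_prune(weights, percent):
    """Zero out roughly the smallest `percent`% of weights by absolute value."""
    k = min(int(len(weights) * percent / 100), len(weights) - 1)
    threshold = sorted(abs(w) for w in weights)[k]
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize_dynamic_int8(values):
    """Symmetric per-tensor int8 quantization with a scale computed at runtime."""
    scale = max(abs(v) for v in values) / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

weights = [0.05, -0.9, 0.3, -0.01, 0.7]
pruned = magnitude_prune(weights, 40)          # the two smallest-magnitude weights become 0.0
quantized, scale = quantize_dynamic_int8(weights)
dequantized = [q * scale for q in quantized]   # close to the original values
```

The real pipeline operates on model tensors rather than Python lists, but the idea is the same: pruning drops small-magnitude weights, and dynamic quantization rescales the remaining weights into int8 using a scale derived from the observed value range.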
## Performance Metrics

{eval_report}

## How to Use

This model is in ONNX format and can be run with Hugging Face Optimum's ONNX Runtime integration. Make sure `optimum[onnxruntime]` and `transformers` are installed.
```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_id = "{repo_id}"

# Load the ONNX model with an ONNX Runtime backend, plus its tokenizer
model = ORTModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Generate a short continuation of the prompt
prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")
gen_tokens = model.generate(**inputs)
print(tokenizer.batch_decode(gen_tokens, skip_special_tokens=True))
```

## AMOP Pipeline Log

{pipeline_log}