srijithrajamohan commited on
Commit
56f2ae6
·
1 Parent(s): 53aeb62

Testing model card updates

Browse files
Files changed (1) hide show
  1. README.md +40 -3
README.md CHANGED
@@ -1,7 +1,44 @@
1
  ---
2
  license: mit
3
  ---
4
- language:
5
- - en
6
- - de
 
 
 
 
 
 
 
 
 
 
 
 
 
 
7
  ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  license: mit
3
  ---
4
+ - en
5
+ pipeline_tag: text-generation
6
+ tags:
7
+ - phi
8
+ - nlp
9
+ - math
10
+ - code
11
+ - chat
12
+ - conversational
13
+ inference:
14
+ parameters:
15
+ temperature: 0
16
+ widget:
17
+ - messages:
18
+ - role: user
19
+ content: How should I explain the Internet?
20
+ library_name: transformers
21
  ---
22
+
23
+ # Phi-4 Model Card
24
+
25
+ [Phi-4 Technical Report](https://arxiv.org/pdf/2412.08905)
26
+
27
+ ## Model Summary
28
+
29
+ | | |
30
+ |-------------------------|-------------------------------------------------------------------------------|
31
+ | **Developers** | Microsoft Research |
32
+ | **Description** | `phi-4` is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.<br><br>`phi-4` underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures |
33
+ | **Architecture** | 14B parameters, dense decoder-only Transformer model |
34
+ | **Inputs** | Text, best suited for prompts in the chat format |
35
+ | **Context length** | 16K tokens |
36
+ | **GPUs** | 1920 H100-80G |
37
+ | **Training time** | 21 days |
38
+ | **Training data** | 9.8T tokens |
39
+ | **Outputs** | Generated text in response to input |
40
+ | **Dates** | October 2024 – November 2024 |
41
+ | **Status** | Static model trained on an offline dataset with cutoff dates of June 2024 and earlier for publicly available data |
42
+ | **Release date** | December 12, 2024 |
43
+ | **License** | MIT |
44
+