ONNX
afmoe
Xenova (HF Staff) committed
Commit 0be224a · verified · 1 Parent(s): 3bad3d7

Update README.md

Files changed (1)
  1. README.md +105 -3
README.md CHANGED
@@ -1,5 +1,107 @@
  ---
  license: apache-2.0
- base_model:
- - arcee-ai/Trinity-Nano-Preview
- ---
+ base_model: arcee-ai/Trinity-Nano-Preview
+ language:
+ - en
+ - es
+ - fr
+ - de
+ - it
+ - pt
+ - ru
+ - ar
+ - hi
+ - ko
+ - zh
+ ---
+
+ <div align="center">
+ <picture>
+ <img
+ src="https://cdn-uploads.huggingface.co/production/uploads/6435718aaaef013d1aec3b8b/i-v1KyAMOW_mgVGeic9WJ.png"
+ alt="Arcee Trinity Mini"
+ style="max-width: 100%; height: auto;"
+ >
+ </picture>
+ </div>
+
+ # Trinity Nano Preview
+
+ Trinity Nano Preview is a preview of Arcee AI's 6B MoE model with 1B active parameters. It is the small-sized model in our new Trinity family, a series of open-weight models for enterprises and tinkerers alike.
+
+ This is a chat-tuned model with a delightful personality and charm that we think users will love. With only 800M non-embedding parameters active per token, it pushes the limits of sparsity in small language models, and as such it **may be unstable** in certain use cases, especially in this preview.
+
+ This is an *experimental* release. It's fun to talk to but will not be hosted anywhere, so download it and try it out yourself!
+
+ ***
+
+ Trinity Nano Preview is trained on 10T tokens gathered and curated through a key partnership with [Datology](https://www.datologyai.com/), building upon the excellent dataset we used on [AFM-4.5B](https://huggingface.co/arcee-ai/AFM-4.5B) with additional math and code.
+
+ Training was performed on a cluster of 512 H200 GPUs powered by [Prime Intellect](https://www.primeintellect.ai/) using HSDP parallelism.
+
+ More details, including key architecture decisions, can be found on our blog [here](https://www.arcee.ai/blog/the-trinity-manifesto).
+
+ ***
+
+ ## Model Details
+
+ * **Model Architecture:** AfmoeForCausalLM
+ * **Parameters:** 6B, 1B active
+ * **Experts:** 128 total, 8 active, 1 shared
+ * **Context length:** 128k
+ * **Training Tokens:** 10T
+ * **License:** [Apache 2.0](https://huggingface.co/arcee-ai/Trinity-Mini#license)
+
+ ***
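The expert configuration listed in Model Details (128 experts, 8 active per token, 1 shared) can be illustrated with a toy top-k router. This is only a minimal sketch under stated assumptions: `routeTopK`, the synthetic logits, and the softmax-over-selected gating are hypothetical choices made for illustration, and the README does not describe how `AfmoeForCausalLM` actually scores or normalizes experts.

```javascript
// Hypothetical helper (not part of any library): pick the k highest-scoring
// experts for one token and normalize their weights with a softmax.
// Assumption: softmax over the selected experts; the real gating may differ.
function routeTopK(logits, k) {
  const ranked = logits
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
  const maxScore = ranked[0].score; // subtract the max for numerical stability
  const exps = ranked.map(({ score }) => Math.exp(score - maxScore));
  const total = exps.reduce((sum, x) => sum + x, 0);
  return ranked.map(({ index }, i) => ({ index, weight: exps[i] / total }));
}

const NUM_EXPERTS = 128;  // "128 total" from the model card
const ACTIVE_EXPERTS = 8; // "8 active" from the model card

// Synthetic, deterministic router logits standing in for a real router output.
const logits = Array.from({ length: NUM_EXPERTS }, (_, i) => Math.sin(i * 12.9898) * 4);

const routed = routeTopK(logits, ACTIVE_EXPERTS);
console.log(`${routed.length} routed experts (plus 1 shared expert) per token`);
console.log(routed);
```

In this sketch the shared expert would simply be applied to every token in addition to the routed ones, which is why only a small fraction of the 6B parameters is active for any single token.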
+
+ <div align="center">
+ <picture>
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6435718aaaef013d1aec3b8b/sSVjGNHfrJKmQ6w8I18ek.png" style="background-color:ghostwhite;padding:5px;" width="17%" alt="Powered by Datology">
+ </picture>
+ </div>
+
+ ### Running our model
+
+ - [Transformers](https://huggingface.co/arcee-ai/Trinity-Mini#transformers)
+ - [vLLM](https://huggingface.co/arcee-ai/Trinity-Mini#vllm)
+ - [llama.cpp](https://huggingface.co/arcee-ai/Trinity-Mini#llamacpp)
+ - [LM Studio](https://huggingface.co/arcee-ai/Trinity-Mini#lm-studio)
+
+ ## Transformers.js
+
+ Install the `v4` preview release of Transformers.js:
+
+ ```
+ npm i @huggingface/transformers@next
+ ```
+
+ You can then run the model as follows:
+
+ ```js
+ import { pipeline, TextStreamer } from "@huggingface/transformers";
+
+ // Create a text generation pipeline
+ const generator = await pipeline(
+   "text-generation",
+   "onnx-community/Trinity-Nano-Preview-ONNX",
+   { device: "webgpu", dtype: "q4f16" },
+ );
+
+ // Define the list of messages
+ const messages = [
+   { role: "system", content: "You are a helpful assistant." },
+   { role: "user", content: "Write me a poem about Machine Learning." },
+ ];
+
+ // Generate a response, streaming the decoded text as it is produced
+ const output = await generator(messages, {
+   max_new_tokens: 512,
+   do_sample: false,
+   streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
+ });
+ // The last message in the returned conversation is the assistant's reply
+ console.log(output[0].generated_text.at(-1).content);
+ ```
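The snippet above hard-codes `device: "webgpu"` and reads the reply with `output[0].generated_text.at(-1).content`. The sketch below shows one way those two steps might be wrapped in small helpers. `pickDevice` and `lastAssistantReply` are hypothetical names, the `"wasm"` fallback is an assumption (the README does not say whether this model runs acceptably without WebGPU), and `mockOutput` is illustrative data shaped like the pipeline's chat output, not a real model response.

```javascript
// Hypothetical helper: choose a device string from a navigator-like object.
// "webgpu" matches the README snippet; the "wasm" fallback is an assumption.
function pickDevice(nav) {
  return nav && "gpu" in nav ? "webgpu" : "wasm";
}

// Hypothetical helper: return the assistant's reply from a chat-style
// text-generation result, where generated_text holds the whole conversation.
function lastAssistantReply(output) {
  const conversation = output[0].generated_text;
  const last = conversation.at(-1);
  if (!last || last.role !== "assistant") {
    throw new Error("No assistant message found at the end of the conversation");
  }
  return last.content;
}

// Mocked pipeline output (illustrative only, not generated by the model).
const mockOutput = [
  {
    generated_text: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Write me a poem about Machine Learning." },
      { role: "assistant", content: "(model reply would appear here)" },
    ],
  },
];

console.log(pickDevice({ gpu: {} })); // navigator-like object with WebGPU
console.log(pickDevice({}));          // navigator-like object without WebGPU
console.log(lastAssistantReply(mockOutput));
```

In a browser, `pickDevice(navigator)` could supply the `device` option passed to `pipeline`, and `lastAssistantReply(output)` could replace the final `console.log` expression.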
+
+ ## License
+
+ Trinity-Nano-Preview is released under the Apache-2.0 license.