Improve language tag

#3
by lbourdois - opened
Files changed (1)
  1. README.md +102 -90
README.md CHANGED
@@ -1,91 +1,103 @@
---
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: apache-2.0
language:
- - en
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
---

![Header](https://raw.githubusercontent.com/Aayan-Mishra/Images/refs/heads/main/Athena.png)

# Athena-1 0.5B

Athena-1 0.5B is a fine-tuned, instruction-following large language model derived from [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct). Designed for ultra-lightweight applications, it balances compactness with robust performance, making it suitable for tasks with limited computational resources.

---

## Key Features

### ⚡ Ultra-Lightweight and Efficient

* **Compact Size:** With just **500 million parameters**, Athena-1 0.5B is ideal for edge devices and low-resource environments.
* **Instruction Following:** Fine-tuned for reliable adherence to user instructions.
* **Coding and Mathematics:** Capable of handling basic coding and mathematical tasks.

### 📖 Contextual Understanding

* **Context Length:** Supports up to **16,384 tokens**, enough for moderately sized conversations or documents.
* **Token Generation:** Can generate up to **4K tokens** of coherent output (see the sketch below).
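To stay inside these limits in practice, cap `max_new_tokens` at the documented 4K output budget. A minimal sketch using the pipeline API (the prompt is illustrative; everything else is stock `transformers`):

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")

# Cap output at the documented 4K generation budget; prompt plus output
# must still fit in the 16,384-token context window.
result = pipe("Summarize the plot of Hamlet.", max_new_tokens=4096)
print(result[0]["generated_text"])
```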

### 🌍 Multilingual Support

* Supports **20+ languages**, including:
  * English, Chinese, French, Spanish, German, Italian, Russian
  * Japanese, Korean, Vietnamese, Thai, and more (example below).
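Non-English prompts work the same way as English ones. A quick sketch, assuming a recent `transformers` release where chat input returns the updated message list (the French prompt is only an illustration):

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")

# Ask a question in French; the model should answer in the prompt language
messages = [{"role": "user", "content": "Explique la photosynthèse en deux phrases."}]
print(pipe(messages, max_new_tokens=128)[0]["generated_text"][-1]["content"])
```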

### 📊 Structured Data & Outputs

* **Structured Data Interpretation:** Handles formats like tables and JSON effectively.
* **Structured Output Generation:** Produces well-formatted outputs for data-specific tasks (see the sketch below).
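One way to exercise the structured-output claim is to ask for JSON and parse the reply. A minimal sketch, with the caveat that small models sometimes emit malformed JSON, so the parse is guarded (the prompt and keys are illustrative):

```python
import json
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")

messages = [{
    "role": "user",
    "content": 'Return only a JSON object with keys "city" and "country" for the capital of Japan.',
}]
reply = pipe(messages, max_new_tokens=64)[0]["generated_text"][-1]["content"]

try:
    data = json.loads(reply)  # parse the assistant reply as JSON
    print(data["city"], data["country"])
except json.JSONDecodeError:
    print("Model returned malformed JSON:", reply)
```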

---

## Model Details

* **Base Model:** [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
* **Architecture:** Transformer with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings.
* **Parameters:** ~500M total.
* **Layers:** 24 (inherited from the base model)
* **Attention Heads:** 14 query heads, 2 key/value heads (GQA, inherited from the base model)
* **Context Length:** Up to **16,384 tokens** (verifiable with the config sketch below).
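The layer and head counts can be confirmed from the repository's `config.json` without downloading the weights; attribute names below follow the standard Qwen2 config in `transformers`:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Spestly/Athena-1-0.5B")
print(config.num_hidden_layers)        # transformer layers
print(config.num_attention_heads)      # query heads
print(config.num_key_value_heads)      # key/value heads (GQA)
print(config.max_position_embeddings)  # maximum context length
```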

---

## Applications

Athena-1 0.5B is optimized for:

* **Conversational AI:** Powers lightweight, responsive chatbots.
* **Code Assistance:** Basic code generation, debugging, and explanations.
* **Mathematical Assistance:** Solves fundamental math problems.
* **Document Processing:** Summarizes and analyzes smaller documents effectively.
* **Multilingual Tasks:** Supports global use cases in a compact footprint.
* **Structured Data:** Reads and generates structured formats like JSON and tables.

---

## Quickstart

Here’s how you can use Athena-1 0.5B for quick text generation:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "What can you do?"},
]
pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")
print(pipe(messages))

# Or load the tokenizer and model directly for finer control
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-0.5B")
model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-0.5B")

# Build the chat prompt, generate, and decode only the new tokens
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
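
For chat-style use you will usually want sampling rather than greedy decoding. Continuing the snippet above, a sketch with common settings (the values are illustrative starting points, not tuned recommendations):

```python
# Sampling instead of greedy decoding; values are illustrative
output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,  # softens the next-token distribution
    top_p=0.9,        # nucleus sampling
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```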