Commit 597871e by Spestly · verified · 1 parent: 6f91c36

Update README.md

Files changed (1): README.md (+12 -14)
README.md CHANGED

@@ -41,15 +41,13 @@ tags:
 - STEM
 - unsloth
 ---
-![Header](Maverick.png)
+# **Athena-3-3B Model Card**
 
-# **Maverick-1-3B Model Card**
-
-*Maverick generated this model card!*
+*Athena generated this model card!*
 
 ## **Model Overview**
 
-**Maverick-1-3B** is a 3.09-billion-parameter causal language model fine-tuned from Qwen2.5-3B-Instruct. This model is designed to excel in various natural language processing tasks, offering enhanced reasoning and instruction-following capabilities.
+**Athena-3-3B** is a 3.09-billion-parameter causal language model fine-tuned from Qwen2.5-3B-Instruct. This model is designed to excel in various natural language processing tasks, offering enhanced reasoning and instruction-following capabilities.
 
 ## **Model Details**
 
@@ -66,32 +64,32 @@ tags:
 
 ## **Training Details**
 
-Maverick-1-3B was fine-tuned using the Unsloth framework on a single NVIDIA A100 GPU. The fine-tuning process spanned approximately 90 minutes over 60 epochs, utilizing a curated dataset focused on instruction-following and general NLP tasks. This approach aimed to enhance the model's performance in complex reasoning and academic tasks.
+Athena-3-3B was fine-tuned using the Unsloth framework on a single NVIDIA A100 GPU. The fine-tuning process spanned approximately 90 minutes over 60 epochs, utilizing a curated dataset focused on instruction-following and general NLP tasks. This approach aimed to enhance the model's performance in complex reasoning and academic tasks.
 
 ## **Intended Use**
 
-Maverick-1-3B is designed for a range of applications, including but not limited to:
+Athena-3-3B is designed for a range of applications, including but not limited to:
 
 - **General NLP Tasks:** Engaging in text completion, summarization, and question-answering tasks.
 - **Academic Assistance:** Providing support for tutoring, essay composition, and research inquiries.
 - **Data Analysis:** Offering insights and interpretations of data-centric queries.
 
-While Maverick-1-3B is a powerful tool for various applications, it is not intended for real-time, safety-critical systems or for processing sensitive personal information.
+While Athena-3-3B is a powerful tool for various applications, it is not intended for real-time, safety-critical systems or for processing sensitive personal information.
 
 ## **How to Use**
 
-To utilize Maverick-1-3B, ensure that you have the latest version of the `transformers` library installed:
+To utilize Athena-3-3B, ensure that you have the latest version of the `transformers` library installed:
 
 ```bash
 pip install transformers
 ```
 
-Here's an example of how to load the Maverick-1-3B model and generate a response:
+Here's an example of how to load the Athena-3-3B model and generate a response:
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "Spestly/Maverick-1-3B"
+model_name = "Spestly/Athena-3-3B"
 model = AutoModelForCausalLM.from_pretrained(
     model_name,
     torch_dtype="auto",
@@ -128,17 +126,17 @@ To use this model with Maverick Search, please refer to this [repository](https:
 
 Users should be aware of the following limitations:
 
-- **Biases:** Maverick-1-3B may exhibit biases present in its training data. Users should critically assess outputs, especially in sensitive contexts.
+- **Biases:** Athena-3-3B may exhibit biases present in its training data. Users should critically assess outputs, especially in sensitive contexts.
 - **Knowledge Cutoff:** The model's knowledge is current up to August 2024. It may not be aware of events or developments occurring after this date.
 - **Language Support:** While primarily trained on English data, performance in other languages may be inconsistent.
 
 ## **Acknowledgements**
 
-Maverick-1-3B builds upon the work of the Qwen team. Gratitude is also extended to the open-source AI community for their contributions to tools and frameworks that facilitated the development of Maverick-1-3B.
+Athena-3-3B builds upon the work of the Qwen team. Gratitude is also extended to the open-source AI community for their contributions to tools and frameworks that facilitated the development of Athena-3-3B.
 
 ## **License**
 
-Maverick-1-3B is released under the MIT License, permitting wide usage with proper attribution.
+Athena-3-3B is released under the MIT License, permitting wide usage with proper attribution.
 
 ## **Contact**
 
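
The "How to Use" snippet in the diff is truncated mid-call. For reference, it can be assembled into a self-contained script — a minimal sketch: the repo id `Spestly/Athena-3-3B` and the `torch_dtype="auto"` argument come from the diff, while the `build_messages`/`generate` helpers, the chat-template call, and the generation settings are illustrative assumptions, not values from the model card.

```python
def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the message format used by Qwen2.5-style chat templates."""
    return [{"role": "user", "content": prompt}]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load Athena-3-3B and return the model's reply to a single prompt."""
    # transformers/torch are heavy optional dependencies here, so import lazily.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Spestly/Athena-3-3B"  # repo id as shown in the diff
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",   # pick fp16/bf16 automatically where supported
        device_map="auto",    # assumption: place weights on GPU if available
    )
    # Render the chat messages into the model's prompt format.
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the new completion is decoded.
    completion = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)

# Example (requires transformers, torch, and downloading ~3B weights):
#   print(generate("Explain gradient descent in two sentences."))
```

Since a 3B checkpoint is large, the download and load happen only when `generate` is first called, which keeps the module importable on machines without a GPU or the weights.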