Arthur Samuel Galego Panucci Figueiredo committed
Commit 165dad9 · verified · 1 parent: a9fdb63

Update README.md

Files changed (1): README.md (+111 −122)
  - transformers
---

# 🧠 Model Card: DogeAI-v1.0-instruct

## Model Details

### Model Description

DogeAI-v1.0-instruct is an early-stage instruction-following language model fine-tuned for conversational use and experimentation. This version is intended as a proof of concept (v1) and focuses on language generation rather than reliable logical reasoning.

- **Developed by:** Arthur (loboGOAT)
- **Funded by:** Independent / community-driven
- **Shared by:** Arthur (loboGOAT)
- **Model type:** Small instruction-tuned language model
- **Language(s):** Portuguese (primary), with multilingual tendencies inherited from the base model
- **License:** Apache 2.0 (or the base model's license, if different)
- **Finetuned from model:** Gemma-3-270M-it

### Model Sources

- **Repository:** loboGOAT/DogeAI-v1.0-instruct
- **Paper:** Not available
- **Demo:** Not available

## Uses

### Direct Use

- Conversational experiments
- Text generation and rewriting
- Prompt testing and evaluation
- Educational use for studying the limitations of small LLMs

### Downstream Use

- Further fine-tuning
- Research on alignment, reasoning, and instruction following
- Benchmarking small models

### Out-of-Scope Use

- Tasks requiring reliable logical reasoning
- Mathematical proofs or formal logic
- Decision-making systems
- Safety-critical or automated validation tasks

### Recommendations

This model should not be relied upon for reasoning-intensive tasks. Users are encouraged to treat DogeAI-v1.0-instruct as an experimental model and to expect occasional logical inconsistencies, multilingual drift, or overgeneration.

Future versions aim to address these limitations through:

- cleaner datasets
- improved stopping criteria
- alternative base models
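
Pending improved stopping criteria, the overgeneration noted above can be mitigated with simple post-processing. A minimal sketch, assuming Gemma-style `<end_of_turn>` markers inherited from the Gemma-3-270M-it base (DogeAI's actual stop tokens are not documented here):

```python
# Hypothetical post-processing to curb overgeneration: cut the decoded
# text at the first stop marker. "<end_of_turn>" is assumed from the
# Gemma-3 base model's chat format; adjust if DogeAI uses another marker.
def truncate_at_stop(text: str, stop: str = "<end_of_turn>") -> str:
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]

generated = "Claro! Podemos conversar.<end_of_turn>\n<start_of_turn>user\nE sobre..."
print(truncate_at_stop(generated))  # → "Claro! Podemos conversar."
```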

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")
model = AutoModelForCausalLM.from_pretrained("loboGOAT/DogeAI-v1.0-instruct")

inputs = tokenizer("Olá! Vamos conversar?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
    temperature=0.65,
    top_p=0.95,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The model was fine-tuned on a custom instruction-style dataset, primarily in Portuguese, designed to encourage conversational responses. The dataset does not focus on formal logic or structured reasoning.

### Training Procedure

#### Preprocessing

- Instruction–response formatting
- Text normalization
- No explicit chain-of-thought supervision
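
The formatting and normalization steps above can be sketched as follows. DogeAI's exact training template is not documented, so this assumes the Gemma-style turn markers used by the Gemma-3-270M-it base model:

```python
# Hypothetical sketch of the instruction–response formatting step.
# The turn markers are assumed from the Gemma-3 chat format.
def format_example(instruction: str, response: str) -> str:
    """Normalize whitespace and render one pair as a single training string."""
    instruction = " ".join(instruction.split())  # simple text normalization
    response = " ".join(response.split())
    return (
        f"<start_of_turn>user\n{instruction}<end_of_turn>\n"
        f"<start_of_turn>model\n{response}<end_of_turn>\n"
    )

print(format_example("Olá!   Tudo bem?", "Tudo ótimo, e com você?"))
```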

#### Training Hyperparameters

- **Training regime:** Supervised fine-tuning (SFT)
- **PEFT:** Yes (LoRA-based fine-tuning)
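
The LoRA idea behind the PEFT fine-tuning can be illustrated without the library: the pretrained weight stays frozen, and only a low-rank factorized update is trained. A toy NumPy sketch (dimensions are illustrative, not DogeAI's actual sizes):

```python
import numpy as np

# Toy illustration of LoRA: the frozen weight W is augmented by a
# trainable rank-r update delta_W = B @ A, with r << min(d_out, d_in).
d_out, d_in, r = 640, 640, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-init

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)  # adapted forward pass; identical to W @ x at init

full_params = d_out * d_in        # trainable parameters without LoRA
lora_params = r * (d_in + d_out)  # trainable parameters with LoRA
print(full_params, lora_params)   # → 409600 10240
```

Zero-initializing `B` makes the adapter a no-op at the start of training, which is the standard LoRA initialization.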

## Evaluation

### Testing Data

Manual testing and prompt-based evaluation.

### Factors

- Logical consistency
- Instruction following
- Language fluency

### Metrics

No automated benchmarks were used for this version.

### Results

- Strong conversational fluency for the model size
- Inconsistent logical reasoning
- Occasional overgeneration beyond the intended response

#### Summary

Strong conversational fluency for the model size, but inconsistent reasoning and occasional overgeneration.

## Model Examination

DogeAI-v1.0-instruct demonstrates the strengths and limitations of small instruction-tuned language models. While capable of natural conversation, it lacks robust reasoning abilities, which will be a focus of future iterations.

## Environmental Impact

- **Hardware Type:** Consumer GPU / local machine
- **Hours used:** Low
- **Cloud Provider:** None
- **Compute Region:** Local
- **Carbon Emitted:** Negligible

## Technical Specifications

### Model Architecture and Objective

- Decoder-only Transformer
- Next-token prediction
- Instruction-following objective
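
The next-token prediction objective listed above can be made concrete with a small worked example: the training loss is the cross-entropy between the model's softmax distribution over the vocabulary and the actual next token. Toy numbers, not DogeAI's real logits:

```python
import math

# Illustrative next-token cross-entropy: -log softmax(logits)[target],
# computed in a numerically stable way (subtract the max logit).
def next_token_loss(logits: list[float], target: int) -> float:
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Toy vocabulary of 3 tokens; the model already favors token 0,
# so the loss for target 0 is small.
loss = next_token_loss([2.0, 0.5, -1.0], target=0)
print(round(loss, 4))
```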

### Compute Infrastructure

Local training environment.

#### Hardware

Consumer-grade GPU / CPU

#### Software

- Transformers
- PEFT 0.18.0
- PyTorch

## Citation

**BibTeX:**

```bibtex
@misc{dogeai_v1_2025,
  title={DogeAI-v1.0-instruct},
  author={Arthur},
  year={2025},
  note={Early experimental instruction-tuned language model}
}
```

**APA:**

Arthur. (2025). *DogeAI-v1.0-instruct: An experimental instruction-tuned language model.*

## Model Card Authors

Arthur

## Model Card Contact

(your Hugging Face profile or GitHub)