cris177 committed on
Commit
a4f254a
·
verified ·
1 Parent(s): 4e39ea9

Update README.md

Files changed (1)
  1. README.md +8 -103
README.md CHANGED
@@ -44,7 +44,6 @@ This model aims to parse simple English arguments, arguments formed of two premi
44
  <!-- Provide the basic links for the model. -->
45
 
46
  - **Repository:** TBD
47
- - **Paper:** TBD
48
  - **Demo:** TBD
49
 
50
  ## Usage
@@ -118,121 +117,27 @@ The model was trained on synthetic data, based on the following types of argumen
118
 
119
  Each argument was constructed by selecting two random propositions (from a list of 400 propositions that was generated beforehand), choosing a type of argument and combining it all with randomly selected connectors (therefore, since, hence, thus, etc).
120
 
121
-
122
 
123
  ### Training Procedure
124
 
125
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
126
 
127
- #### Preprocessing [optional]
128
 
129
  [More Information Needed]
 
130
 
 
131
 
132
- #### Training Hyperparameters
133
-
134
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
135
 
136
- #### Speeds, Sizes, Times [optional]
137
 
138
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
139
-
140
- [More Information Needed]
141
 
142
  ## Evaluation
143
 
144
  <!-- This section describes the evaluation protocols and provides the results. -->
145
 
146
- ### Testing Data, Factors & Metrics
147
-
148
- #### Testing Data
149
-
150
- <!-- This should link to a Dataset Card if possible. -->
151
-
152
- [More Information Needed]
153
-
154
- #### Factors
155
-
156
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
157
-
158
- [More Information Needed]
159
-
160
- #### Metrics
161
-
162
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
163
-
164
- [More Information Needed]
165
-
166
- ### Results
167
-
168
- [More Information Needed]
169
-
170
- #### Summary
171
-
172
-
173
-
174
- ## Model Examination [optional]
175
-
176
- <!-- Relevant interpretability work for the model goes here -->
177
-
178
- [More Information Needed]
179
-
180
- ## Environmental Impact
181
-
182
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
183
-
184
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
185
-
186
- - **Hardware Type:** [More Information Needed]
187
- - **Hours used:** [More Information Needed]
188
- - **Cloud Provider:** [More Information Needed]
189
- - **Compute Region:** [More Information Needed]
190
- - **Carbon Emitted:** [More Information Needed]
191
-
192
- ## Technical Specifications [optional]
193
-
194
- ### Model Architecture and Objective
195
-
196
- [More Information Needed]
197
-
198
- ### Compute Infrastructure
199
-
200
- [More Information Needed]
201
-
202
- #### Hardware
203
-
204
- [More Information Needed]
205
-
206
- #### Software
207
-
208
- [More Information Needed]
209
-
210
- ## Citation [optional]
211
-
212
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
213
-
214
- **BibTeX:**
215
-
216
- [More Information Needed]
217
-
218
- **APA:**
219
-
220
- [More Information Needed]
221
-
222
- ## Glossary [optional]
223
-
224
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
225
-
226
- [More Information Needed]
227
-
228
- ## More Information [optional]
229
-
230
- [More Information Needed]
231
-
232
- ## Model Card Authors [optional]
233
-
234
- [More Information Needed]
235
-
236
- ## Model Card Contact
237
-
238
- [More Information Needed]
 
44
  <!-- Provide the basic links for the model. -->
45
 
46
  - **Repository:** TBD
 
47
  - **Demo:** TBD
48
 
49
  ## Usage
 
117
 
118
  Each argument was constructed by selecting two random propositions (from a list of 400 propositions that was generated beforehand), choosing a type of argument and combining it all with randomly selected connectors (therefore, since, hence, thus, etc).
119
 
120
+ We generated 50,000 arguments to train the model and 100 to test it.
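
The construction procedure described above can be sketched roughly as follows. This is only an illustration: the proposition list, the three argument types, and the names `PROPOSITIONS`, `ARGUMENT_TYPES`, `make_argument`, and `make_dataset` are assumptions, since the README excerpt shown here does not include the actual 400 propositions or the generator code.

```python
import random

# Hypothetical stand-ins: the real list has 400 propositions, not shown here.
PROPOSITIONS = [
    "it is raining",
    "the street is wet",
    "the alarm rang",
    "the dog barked",
]

# Conclusion connectors taken from the examples named in the README.
CONNECTORS = ["therefore", "hence", "thus"]

# Assumed argument types; each maps two propositions (p, q) to
# (premise 1, premise 2, conclusion).
ARGUMENT_TYPES = {
    "modus ponens": lambda p, q: (f"if {p} then {q}", p, q),
    "modus tollens": lambda p, q: (
        f"if {p} then {q}",
        f"it is not the case that {q}",
        f"it is not the case that {p}",
    ),
    "disjunctive syllogism": lambda p, q: (
        f"{p} or {q}",
        f"it is not the case that {p}",
        q,
    ),
}


def make_argument(rng):
    """Build one synthetic argument from two distinct random propositions."""
    p, q = rng.sample(PROPOSITIONS, 2)
    kind = rng.choice(sorted(ARGUMENT_TYPES))
    premise1, premise2, conclusion = ARGUMENT_TYPES[kind](p, q)
    connector = rng.choice(CONNECTORS)
    text = f"{premise1.capitalize()}. {premise2.capitalize()}, {connector} {conclusion}."
    return {
        "type": kind,
        "premises": [premise1, premise2],
        "conclusion": conclusion,
        "text": text,
    }


def make_dataset(n, seed=0):
    """Generate n arguments reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [make_argument(rng) for _ in range(n)]
```

Under these assumptions, `make_dataset(50_000)` and `make_dataset(100, seed=1)` would give splits of the sizes mentioned above. With a small proposition pool the two splits could overlap, so a real generator would probably need to deduplicate them.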
121
 
122
  ### Training Procedure
123
 
124
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
125
 
126
+ #### Preprocessing
127
 
128
  [More Information Needed]
129
+ We converted the data to the Alpaca chat format before feeding it to the model.
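
As a rough illustration of this preprocessing step, the snippet below wraps one example in the widely used Alpaca prompt template (instruction, input, response). The instruction wording, the output layout, and the helper name `to_alpaca` are assumptions; the README does not show the exact prompt used for training.

```python
# Standard Alpaca template for examples that carry an input field.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

# Hypothetical instruction text; the actual training instruction is not given.
DEFAULT_INSTRUCTION = "Identify the premises and the conclusion of the argument."


def to_alpaca(argument_text, parsed_output, instruction=DEFAULT_INSTRUCTION):
    """Format one (argument, parse) pair as a single Alpaca-style training string."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction,
        input=argument_text,
        output=parsed_output,
    )
```

At inference time the same template would typically be used with an empty `output`, so that the model continues the text after `### Response:`.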
130
 
131
+ #### Training
132
 
133
+ We used Unsloth to reduce memory usage and speed up training.
 
 
134
 
135
+ We trained for one epoch.
136
 
137
+ Training used less than 2.5 GB of VRAM and took 2.5 hours.
 
 
138
 
139
  ## Evaluation
140
 
141
  <!-- This section describes the evaluation protocols and provides the results. -->
142
 
143
+ The model reaches 100% accuracy on both the training and test splits of our synthetic dataset.