wmaousley committed 37e5705 (verified) · Parent: 695a0c3

Upload README.md with huggingface_hub

Files changed (1): README.md (+187, -189)

README.md (after change):

---
license: apache-2.0
language:
- en
library_name: peft
base_model: Qwen/Qwen2-7B-Instruct
tags:
- finance
- trading
- ai-safety
- adversarial-testing
- critique
- lora
- qwen2
datasets:
- custom
pipeline_tag: text-generation
---

# MiniCrit-7B: Adversarial AI Critique Model

<p align="center">
  <img src="https://img.shields.io/badge/Model-MiniCrit--7B-blue" alt="Model">
  <img src="https://img.shields.io/badge/Base-Qwen2--7B--Instruct-green" alt="Base Model">
  <img src="https://img.shields.io/badge/Method-LoRA-orange" alt="Method">
  <img src="https://img.shields.io/badge/License-Apache%202.0-red" alt="License">
</p>

## Model Description

**MiniCrit-7B** is a specialized adversarial AI model trained to identify flawed reasoning in autonomous AI systems before they cause catastrophic failures. Developed by [Antagon Inc.](https://antagon.ai), MiniCrit acts as an AI "devil's advocate" that critiques trading rationales, detecting issues like:

- Overconfident predictions
- Overfitting to historical patterns
- Spurious correlations
- Survivorship bias
- Confirmation bias
- Missing risk factors

## Model Details

| Attribute | Value |
|-----------|-------|
| **Developer** | Antagon Inc. (CAGE: 17E75, UEI: KBSGT7CZ4AH3) |
| **Base Model** | Qwen/Qwen2-7B-Instruct |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Trainable Parameters** | 40.4M (0.53% of 7.6B total) |
| **Training Data** | 11.7M critique examples |
| **Training Hardware** | NVIDIA H100 PCIe (80GB) via [Lambda Labs](https://lambdalabs.com) GPU Grant |
| **License** | Apache 2.0 |
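
The trainable-parameter figure is easy to sanity-check: a LoRA adapter on a weight of shape (d_out, d_in) adds r * (d_in + d_out) parameters. Using Qwen2-7B's published dimensions (hidden size 3584, intermediate size 18944, 28 layers, 4 KV heads of head dimension 128), the arithmetic below reproduces the 40.4M figure; it is a back-of-the-envelope sketch, not output from the training run.

```python
# Back-of-the-envelope LoRA parameter count for r=16 on all seven
# target modules of Qwen2-7B-Instruct (dimensions from the public config).
r = 16
hidden, intermediate, kv_dim, layers = 3584, 18944, 4 * 128, 28

per_layer = (
    r * (hidden + hidden)               # q_proj
    + 2 * r * (hidden + kv_dim)         # k_proj, v_proj (grouped-query KV)
    + r * (hidden + hidden)             # o_proj
    + 2 * r * (hidden + intermediate)   # gate_proj, up_proj
    + r * (intermediate + hidden)       # down_proj
)
print(f"{layers * per_layer:,}")  # 40,370,176, i.e. ~40.4M trainable parameters
```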

## Training Details

### Dataset

- **Size**: 11,674,598 training examples
- **Format**: rationale-critique pairs (an illustrative record is sketched below)
- **Domain**: financial trading signals (stocks, options, crypto)
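
The card does not publish the dataset schema; purely as an illustration of the pairing, one record might look like this (field names assumed, content hypothetical):

```python
# Hypothetical training record; the actual serialization and field names
# are not specified in the model card.
example = {
    "rationale": "AAPL long: MACD bullish crossover with supporting momentum.",
    "critique": (
        "MACD crossovers lag price and can whipsaw in ranging markets; "
        "the rationale cites no regime filter, position size, or risk limit."
    ),
}
```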

### Training Configuration

```yaml
learning_rate: 2e-4
lr_scheduler: cosine
warmup_steps: 500
batch_size: 32  # effective batch size
max_sequence_length: 512
epochs: 1
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
```
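
The LoRA settings above map one-to-one onto peft's `LoraConfig`. This is a minimal sketch of the mapping, not the actual training script; `task_type` is assumed for a causal LM:

```python
from peft import LoraConfig

# Mirrors the YAML configuration above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```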

### Training Progress

- **Steps Completed**: 35,650 / 364,831 (9.8%)
- **Initial Loss**: 1.8573
- **Final Loss**: 0.7869
- **Loss Reduction**: 57.6%

## Usage

### Installation

```bash
pip install transformers peft torch
```

### Loading the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Antagon/MiniCrit-7B")
```
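
If the adapter is only needed for inference, it can optionally be folded into the base weights with peft's `merge_and_unload`, trading the ability to swap adapters for slightly faster generation:

```python
# Optional: merge the LoRA weights into the base model for inference.
model = model.merge_and_unload()
```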

### Inference

```python
def critique_rationale(rationale: str) -> str:
    prompt = f"### Rationale:\n{rationale}\n\n### Critique:\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Critique:\n")[-1]

# Example
rationale = "AAPL long: MACD bullish crossover with supporting momentum."
critique = critique_rationale(rationale)
print(critique)
```

### Example Output

```
Input: "META long: Bollinger Band expansion with supporting momentum."

Output: "While Bollinger Band expansion can signal volatility, META's recent
expansion isn't necessarily predictive; it could be a reaction to news, not
a precursor to sustained movement. Furthermore, relying solely on momentum
without considering overbought/oversold levels may lead to premature entry,
especially if the expansion is already near its peak."
```

## Performance

### Production Metrics (MiniCrit-1.5B)

The production figures below come from the deployed MiniCrit-1.5B model, not this 7B checkpoint.

- **False Signal Reduction**: 35%
- **Sharpe Ratio Improvement**: +0.28
- **Live Trades Processed**: 38,000+

### Training Metrics

| Metric | Value |
|--------|-------|
| Initial Loss | 1.8573 |
| Final Loss | 0.7869 |
| Loss Reduction | 57.6% |
| Gradient Norm (avg) | 0.45 |

## Intended Use

### Primary Use Cases

- Validating AI trading signals before execution (see the sketch below)
- Identifying reasoning flaws in autonomous systems
- Risk assessment for algorithmic trading
- Quality assurance for AI-generated analysis
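
One way such a pre-execution check could be wired into a signal pipeline is sketched below. The helper name `review_signal` and the keyword-based gating heuristic are illustrative assumptions, not part of the model card; a production system would need a more robust way to act on the critique.

```python
# Hypothetical gating sketch: hold a signal for human review whenever the
# critique mentions a known failure mode. Uses critique_rationale() from
# the Inference section above. Stems catch variants such as
# "overconfident"/"overconfidence".
FLAGS = ("overconfiden", "overfit", "spurious", "survivorship",
         "confirmation bias", "risk factor")

def review_signal(rationale: str) -> dict:
    critique = critique_rationale(rationale)
    flagged = [flag for flag in FLAGS if flag in critique.lower()]
    return {"critique": critique, "needs_human_review": bool(flagged)}
```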

### Out-of-Scope Uses

This model is NOT intended for:

- Generating trading signals
- Financial advice
- Autonomous trading decisions

## Limitations

- Trained primarily on the trading/finance domain
- May not generalize well to other critique domains without fine-tuning
- This checkpoint represents partial training (9.8% of planned steps)
- Should be used as a supplement to human judgment, not a replacement

## Citation

```bibtex
@misc{minicrit7b2026,
  title={MiniCrit-7B: Adversarial AI Critique for Trading Signal Validation},
  author={Ousley, William Alexander and Ousley, Jacqueline Villamor},
  year={2026},
  publisher={Antagon Inc.},
  url={https://huggingface.co/Antagon/MiniCrit-7B}
}
```

## Contact

- **Company**: Antagon Inc.
- **Website**: [antagon.ai](https://antagon.ai)
- **CAGE Code**: 17E75
- **UEI**: KBSGT7CZ4AH3

## Acknowledgments

We gratefully acknowledge **[Lambda Labs](https://lambdalabs.com)** for providing GPU compute through their Research Grant program. MiniCrit-7B was trained on Lambda's H100 infrastructure, and their support has been instrumental in advancing our AI safety research.

## License

This model is released under the Apache 2.0 License.