---
license: apache-2.0
language:
- en
library_name: peft
base_model: Qwen/Qwen2-7B-Instruct
tags:
- finance
- trading
- ai-safety
- adversarial-testing
- critique
- lora
- qwen2
datasets:
- custom
pipeline_tag: text-generation
---
# MiniCrit-7B: Adversarial AI Critique Model
## Model Description
**MiniCrit-7B** is a specialized adversarial AI model trained to identify flawed reasoning in autonomous AI systems before those flaws cause catastrophic failures. Developed by [Antagon Inc.](https://antagon.ai), MiniCrit acts as an AI "devil's advocate" that critiques trading rationales, flagging issues such as:
- Overconfident predictions
- Overfitting to historical patterns
- Spurious correlations
- Survivorship bias
- Confirmation bias
- Missing risk factors
## Model Details
| Attribute | Value |
|-----------|-------|
| **Developer** | Antagon Inc. (CAGE: 17E75, UEI: KBSGT7CZ4AH3) |
| **Base Model** | Qwen/Qwen2-7B-Instruct |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Trainable Parameters** | 40.4M (0.53% of 7.6B total) |
| **Training Data** | 11.7M critique examples |
| **Training Hardware** | NVIDIA H100 PCIe (80GB) via [Lambda Labs](https://lambdalabs.com) GPU Grant |
| **License** | Apache 2.0 |
## Training Details
### Dataset
- **Size**: 11,674,598 training examples
- **Format**: Rationale → Critique pairs (illustrative example below)
- **Domain**: Financial trading signals (stocks, options, crypto)
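The exact serialization of the custom dataset is not published. Assuming each pair mirrors the prompt scaffolding used at inference (see Usage below), a single training example would look roughly like the following; the content is illustrative, not taken from the dataset:

```python
# Illustrative only: assumes training pairs use the same
# "### Rationale / ### Critique" scaffolding as the inference prompt.
example = (
    "### Rationale:\n"
    "TSLA long: RSI oversold, expecting mean reversion.\n\n"
    "### Critique:\n"
    "An oversold RSI alone is weak evidence; in strong downtrends the "
    "indicator can remain oversold for extended periods."
)
```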
### Training Configuration
```yaml
learning_rate: 2e-4
lr_scheduler: cosine
warmup_steps: 500
batch_size: 32  # effective (per-device batch × gradient accumulation)
max_sequence_length: 512
epochs: 1
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
```
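For reference, the configuration above maps onto the PEFT and Transformers APIs roughly as follows. This is a minimal sketch, not the actual training script: the 4 × 8 per-device/gradient-accumulation split is an assumption (only the effective batch size of 32 is documented), and `base_model` is the loaded Qwen2-7B-Instruct model from the Usage section below.

```python
from transformers import TrainingArguments
from peft import LoraConfig, get_peft_model

# LoRA setup matching the YAML configuration above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # ~40.4M trainable (0.53% of 7.6B)

# Assumed 4 x 8 split; only the effective batch size of 32 is documented.
# Truncation to 512 tokens is applied at tokenization time, not here.
training_args = TrainingArguments(
    output_dir="minicrit-7b-lora",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
)
```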
### Training Progress
- **Steps Completed**: 35,650 / 364,831 (9.8%)
- **Initial Loss**: 1.8573
- **Final Loss**: 0.7869
- **Loss Reduction**: 57.6%
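For reference, the reduction is computed as (1.8573 − 0.7869) / 1.8573 ≈ 57.6%, achieved within the first 9.8% of the planned schedule.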
## Usage
### Installation
```bash
pip install transformers peft torch accelerate
```
### Loading the Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Antagon/MiniCrit-7B")
```
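If you do not need to swap adapters at runtime, the LoRA weights can optionally be merged into the base model for slightly faster inference. This is a standard PEFT operation, not specific to MiniCrit:

```python
# Optional: fold the adapter into the base weights. The result behaves like
# a plain Qwen2 model, but the adapter can no longer be detached afterwards.
model = model.merge_and_unload()
```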
### Inference
```python
def critique_rationale(rationale: str) -> str:
    prompt = f"### Rationale:\n{rationale}\n\n### Critique:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Critique:\n")[-1]

# Example
rationale = "AAPL long: MACD bullish crossover with supporting momentum."
critique = critique_rationale(rationale)
print(critique)
```
### Example Output
```
Input: "META long: Bollinger Band expansion with supporting momentum."
Output: "While Bollinger Band expansion can signal volatility, META's recent
expansion isn't necessarily predictive; it could be a reaction to news, not
a precursor to sustained movement. Furthermore, relying solely on momentum
without considering overbought/oversold levels may lead to premature entry,
especially if the expansion is already near its peak."
```
## Performance
### Production Metrics (MiniCrit-1.5B)
The figures below were observed in live deployment of the smaller MiniCrit-1.5B predecessor model:
- **False Signal Reduction**: 35%
- **Sharpe Ratio Improvement**: +0.28
- **Live Trades Processed**: 38,000+
### Training Metrics
| Metric | Value |
|--------|-------|
| Initial Loss | 1.8573 |
| Final Loss | 0.7869 |
| Loss Reduction | 57.6% |
| Gradient Norm (avg) | 0.45 |
## Intended Use
### Primary Use Cases
- Validating AI trading signals before execution (see the sketch after this list)
- Identifying reasoning flaws in autonomous systems
- Risk assessment for algorithmic trading
- Quality assurance for AI-generated analysis
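As a hypothetical illustration of the first use case, the critique can gate signals before they reach execution. The keyword screen below is purely illustrative; the model's output has no fixed schema, so a production system would need a more robust parsing or scoring step:

```python
# Hypothetical pre-execution gate built on critique_rationale() from the
# Inference section. The keyword list is illustrative, not a model contract.
RISK_KEYWORDS = ("overconfident", "spurious", "survivorship", "overfitting")

def should_escalate(rationale: str) -> bool:
    """Return True if the critique mentions a known failure pattern."""
    critique = critique_rationale(rationale).lower()
    return any(keyword in critique for keyword in RISK_KEYWORDS)

signal = "AAPL long: MACD bullish crossover with supporting momentum."
if should_escalate(signal):
    print("Signal flagged for human review before execution.")
```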
### Out-of-Scope Uses
This model is **not** intended for:
- Generating trading signals
- Providing financial advice
- Making autonomous trading decisions
## Limitations
- Trained primarily on trading/finance domain
- May not generalize well to other critique domains without fine-tuning
- Checkpoint represents partial training (9.8% of planned steps)
- Should be used as a supplement to human judgment, not a replacement
## Citation
```bibtex
@misc{minicrit7b2026,
  title={MiniCrit-7B: Adversarial AI Critique for Trading Signal Validation},
  author={Ousley, William Alexander and Ousley, Jacqueline Villamor},
  year={2026},
  publisher={Antagon Inc.},
  url={https://huggingface.co/Antagon/MiniCrit-7B}
}
```
## Contact
- **Company**: Antagon Inc.
- **Website**: [antagon.ai](https://antagon.ai)
- **CAGE Code**: 17E75
- **UEI**: KBSGT7CZ4AH3
## Acknowledgments
We gratefully acknowledge **[Lambda Labs](https://lambdalabs.com)** for providing GPU compute through their Research Grant program. MiniCrit-7B was trained on Lambda's H100 infrastructure, and their support has been instrumental in advancing our AI safety research.
## License
This model is released under the Apache 2.0 License.