---
license: apache-2.0
language:
- en
library_name: peft
base_model: Qwen/Qwen2-7B-Instruct
tags:
- finance
- trading
- ai-safety
- adversarial-testing
- critique
- lora
- qwen2
datasets:
- custom
pipeline_tag: text-generation
---

# MiniCrit-7B: Adversarial AI Critique Model


## Model Description

**MiniCrit-7B** is a specialized adversarial AI model trained to identify flawed reasoning in autonomous AI systems before they cause catastrophic failures. Developed by [Antagon Inc.](https://antagon.ai), MiniCrit acts as an AI "devil's advocate" that critiques trading rationales, detecting issues such as:

- Overconfident predictions
- Overfitting to historical patterns
- Spurious correlations
- Survivorship bias
- Confirmation bias
- Missing risk factors

## Model Details

| Attribute | Value |
|-----------|-------|
| **Developer** | Antagon Inc. (CAGE: 17E75, UEI: KBSGT7CZ4AH3) |
| **Base Model** | Qwen/Qwen2-7B-Instruct |
| **Method** | LoRA (Low-Rank Adaptation) |
| **Trainable Parameters** | 40.4M (0.53% of 7.6B total) |
| **Training Data** | 11.7M critique examples |
| **Training Hardware** | NVIDIA H100 PCIe (80GB) via [Lambda Labs](https://lambdalabs.com) GPU Grant |
| **License** | Apache 2.0 |

## Training Details

### Dataset

- **Size**: 11,674,598 training examples
- **Format**: Rationale → Critique pairs
- **Domain**: Financial trading signals (stocks, options, crypto)

### Training Configuration

```yaml
learning_rate: 2e-4
lr_scheduler: cosine
warmup_steps: 500
batch_size: 32  # effective
max_sequence_length: 512
epochs: 1
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
```

A sketch of how these settings map onto `peft`/`transformers` code appears after the usage examples below.

### Training Progress

- **Steps Completed**: 35,650 / 364,831 (9.8%)
- **Initial Loss**: 1.8573
- **Loss at Checkpoint**: 0.7869
- **Loss Reduction**: 57.6%

## Usage

### Installation

```bash
pip install transformers peft torch
```

### Loading the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "Antagon/MiniCrit-7B")
```

A short sketch for merging the adapter into the base model for adapter-free serving also follows the usage examples below.

### Inference

```python
def critique_rationale(rationale: str) -> str:
    prompt = f"### Rationale:\n{rationale}\n\n### Critique:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response.split("### Critique:\n")[-1]

# Example
rationale = "AAPL long: MACD bullish crossover with supporting momentum."
critique = critique_rationale(rationale)
print(critique)
```

### Example Output

```
Input: "META long: Bollinger Band expansion with supporting momentum."

Output: "While Bollinger Band expansion can signal volatility, META's recent expansion isn't necessarily predictive; it could be a reaction to news, not a precursor to sustained movement. Furthermore, relying solely on momentum without considering overbought/oversold levels may lead to premature entry, especially if the expansion is already near its peak."
```
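Because `do_sample=True` with `temperature=0.7`, critiques vary from run to run. For validation pipelines that need reproducible output, sampling can be disabled; a minimal variant of the call above, reusing `model` and `tokenizer` from the loading snippet:

```python
# Greedy (deterministic) decoding: same prompt format, sampling disabled.
prompt = "### Rationale:\nAAPL long: MACD bullish crossover with supporting momentum.\n\n### Critique:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding; output is reproducible across runs
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Critique:\n")[-1])
```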
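If adapter-free serving is preferred (for export or deployment), the LoRA weights loaded above can be folded into the base model with PEFT's merge utility. A minimal sketch; the output directory name is hypothetical:

```python
# Fold the LoRA weights into the base model; returns a plain transformers model.
merged = model.merge_and_unload()
merged.save_pretrained("minicrit-7b-merged")     # hypothetical output directory
tokenizer.save_pretrained("minicrit-7b-merged")
```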
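Finally, for readers reproducing a similar fine-tune, the hyperparameters from the Training Configuration section map onto `peft` and `transformers` objects roughly as below. This is a sketch, not the training script used for this model: the card does not state which training loop was used, how the effective batch size of 32 was split across per-device batch and gradient accumulation, or the training precision, so those values are assumptions.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Adapter configuration mirroring the Training Configuration section.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: standard causal-LM fine-tuning
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", torch_dtype=torch.bfloat16
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # should report roughly 0.5% trainable

# Optimizer/scheduler settings from the card; the batch split is an assumption.
# max_sequence_length: 512 is applied when tokenizing the dataset, not here.
training_args = TrainingArguments(
    output_dir="minicrit-7b-lora",    # hypothetical path
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    per_device_train_batch_size=4,    # assumption
    gradient_accumulation_steps=8,    # 4 * 8 = 32 effective
    num_train_epochs=1,
    bf16=True,                        # assumption
)
```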
## Performance

### Production Metrics (MiniCrit-1.5B)

The figures below come from MiniCrit-1.5B, the smaller model deployed in production, not from MiniCrit-7B itself:

- **False Signal Reduction**: 35%
- **Sharpe Ratio Improvement**: +0.28
- **Live Trades Processed**: 38,000+

### Training Metrics

| Metric | Value |
|--------|-------|
| Initial Loss | 1.8573 |
| Loss at Checkpoint | 0.7869 |
| Loss Reduction | 57.6% |
| Gradient Norm (avg) | 0.45 |

## Intended Use

### Primary Use Cases

- Validating AI trading signals before execution
- Identifying reasoning flaws in autonomous systems
- Risk assessment for algorithmic trading
- Quality assurance for AI-generated analysis

### Out-of-Scope Uses

This model is NOT intended for:

- Generating trading signals
- Financial advice
- Autonomous trading decisions

## Limitations

- Trained primarily on the trading/finance domain
- May not generalize well to other critique domains without fine-tuning
- This checkpoint represents partial training (9.8% of planned steps)
- Should be used as a supplement to human judgment, not a replacement

## Citation

```bibtex
@misc{minicrit7b2026,
  title={MiniCrit-7B: Adversarial AI Critique for Trading Signal Validation},
  author={Ousley, William Alexander and Ousley, Jacqueline Villamor},
  year={2026},
  publisher={Antagon Inc.},
  url={https://huggingface.co/Antagon/MiniCrit-7B}
}
```

## Contact

- **Company**: Antagon Inc.
- **Website**: [antagon.ai](https://antagon.ai)
- **CAGE Code**: 17E75
- **UEI**: KBSGT7CZ4AH3

## Acknowledgments

We gratefully acknowledge **[Lambda Labs](https://lambdalabs.com)** for providing GPU compute through their Research Grant program. MiniCrit-7B was trained on Lambda's H100 infrastructure, and their support has been instrumental in advancing our AI safety research.

## License

This model is released under the Apache 2.0 License.