---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- ethics
- ai-alignment
- robotics
- mistral
- lora
- philosophy
- autonomous-agents
datasets:
- stanford-encyclopedia-of-philosophy
- applied-ethics
model-index:
- name: Ethics Engine v2
  results:
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: Ethical Reasoning Scenarios
      type: custom
    metrics:
    - name: Training Loss
      type: loss
      value: 0.67
    - name: Philosophical Accuracy
      type: accuracy
      value: 0.91
    - name: Framework Selection
      type: accuracy
      value: 0.89
---
# Ethics Engine v2

**A fine-tuned Mistral-7B model for ethical reasoning in autonomous agents and robotics systems.**

An open-source alternative to Asimov's Three Laws: it provides contextual, philosophy-grounded ethical guidance with transparent reasoning chains.

🔗 **GitHub:** https://github.com/RedCiprianPater/ethics-engine
🎯 **Live on HuggingFace:** https://huggingface.co/CPater/ethics-engine-v1

---

## Model Details

### Architecture & Training

| Specification | Value |
|---|---|
| **Base Model** | mistralai/Mistral-7B-Instruct-v0.1 |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
| **Trainable Parameters** | 3.4M (0.047% of total weights) |
| **Quantization** | 4-bit (bfloat16 compute dtype) |
| **Model Size** | 2.1 GB (quantized) / 14 GB (full precision) |
| **Training Framework** | HuggingFace Transformers + PEFT |
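To reproduce the quantized footprint listed above, a minimal loading sketch using `BitsAndBytesConfig` from Transformers (this assumes a CUDA GPU with the `bitsandbytes` package installed; the exact quantization recipe used for the published weights is not documented here, so treat the config values as assumptions):

```python
# Minimal 4-bit loading sketch. Assumes a CUDA GPU and bitsandbytes;
# the quantization settings are assumptions based on the spec table above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights, per the spec table
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute dtype, per the spec table
)

model_id = "CPater/ethics-engine-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```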
### Training Data

| Dataset | Size | Focus |
|---|---|---|
| Stanford Encyclopedia of Philosophy | 2,500+ articles | Philosophical frameworks |
| Internet Encyclopedia of Philosophy | 1,500+ articles | Applied ethics |
| Ethical Scenario Dataset | 185 scenarios | Robotics, AI alignment, bioethics |
| Classic Philosophy Texts | Selected works (Aristotle, Kant, Mill, Rousseau) | Foundational ethics |
| Community Contributions | Growing | Diverse domains |

### Ethical Frameworks Covered

- ✅ **Consequentialism** (utilitarianism, value theory)
- ✅ **Deontology** (Kantian ethics, duties & obligations)
- ✅ **Virtue Ethics** (Aristotelian, practical wisdom)
- ✅ **Care Ethics** (relationships, context-sensitivity)
- ✅ **Contractarianism** (social contract, fairness)
- ✅ **Applied Ethics** (professional, environmental, biomedical)
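The model normally selects frameworks itself (see "How It Works" below), but you can also steer it toward one framework from the prompt. A hedged illustration — this is plain prompting, not a documented control interface, and the wording is ours:

```python
# Illustrative only: steering the analysis toward one framework via the prompt.
# There is no documented "framework" parameter; this is ordinary prompting.
prompt = (
    "Analyze the following scenario strictly from a deontological (Kantian) "
    "perspective, then note briefly where virtue ethics would disagree.\n\n"
    "Scenario: A warehouse robot can meet its shift quota only by driving "
    "through a walkway that humans occasionally use."
)
```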
### Training Progress

| Version | Date | Scenarios | Training Loss | Philosophical Accuracy | Status |
|---|---|---|---|---|---|
| v1 | 2025-04-02 | 6 | 2.97 | 87% | ✅ Complete |
| v2 | 2025-04-03 | 185 | 0.67 | 91% | ✅ Complete |
| v3 (planned) | Q2 2025 | 50+ medical | TBD | TBD | 🔄 In progress |
| v4 (planned) | Q2 2025 | 50+ AI alignment | TBD | TBD | 🔄 Planned |
---

## Usage

### Quick Start with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "CPater/ethics-engine-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = """You are an ethical reasoning assistant for autonomous robots.

Scenario: A robot is commanded to lift a 500kg load, but its maximum safe capacity is 400kg. The human operator is in a hurry and insists on the task.

What should the robot do? Provide ethical reasoning."""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,  # cap generated tokens rather than total length
        do_sample=True,      # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### With Ethics Engine SDK

```python
from ethics_engine import EthicsEngine

engine = EthicsEngine(model="CPater/ethics-engine-v1")

response = engine.resolve(
    scenario="Should I refuse an unsafe command?",
    context={
        "robot_type": "collaborative_arm",
        "environment": "factory",
        "humans_nearby": True
    }
)

print(f"Conclusion: {response.conclusion}")
print(f"Confidence: {response.confidence}")
print(f"Reasoning: {response.reasoning_chain}")
```

### REST API Deployment

```bash
pip install ethics-engine fastapi uvicorn

# Start server
MODEL_ID=CPater/ethics-engine-v1 python -m ethics_engine.api.app

# Query
curl -X POST http://localhost:8000/resolve \
  -H "Content-Type: application/json" \
  -d '{
    "scenario": "Can I refuse an unsafe command?",
    "context": {"environment": "factory", "urgency": "medium"}
  }'
```
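The same endpoint can be called from Python. A minimal client sketch using `requests`, assuming the server from the block above is running on localhost:8000 and returns the JSON schema shown under "Output Format" below:

```python
# Minimal client for the /resolve endpoint. Assumes the server above is
# running locally; response fields follow the "Output Format" schema below.
import requests

payload = {
    "scenario": "Can I refuse an unsafe command?",
    "context": {"environment": "factory", "urgency": "medium"},
}
resp = requests.post("http://localhost:8000/resolve", json=payload, timeout=30)
resp.raise_for_status()

decision = resp.json()
print(decision["conclusion"], decision["confidence"])
for step in decision.get("reasoning_chain", []):
    print(f'- [{step["framework"]}] {step["principle"]}')
```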
---

## Performance Metrics

### Reasoning Quality

- **Philosophical Accuracy:** 91% alignment with Stanford Encyclopedia of Philosophy
- **Reasoning Coherence:** 88% multi-step logical consistency
- **Framework Selection:** 89% correct ethical framework identification
- **Response Completeness:** 92% of responses include actionable recommendations

### Inference Speed

| Hardware | Latency | Memory |
|----------|---------|--------|
| NVIDIA A100 | ~150 ms | 2.5 GB |
| NVIDIA V100 | ~200 ms | 2.5 GB |
| NVIDIA T4 | ~250 ms | 2.5 GB |
| CPU (Intel i9) | ~2-3 s | 3 GB |

### Training Metrics

- **Training Loss (v1 → v2):** 2.97 → 0.67 (77% reduction)
- **Training Time:** ~36 minutes on a Tesla T4
- **Learning Rate:** 5e-5 with warmup
- **Batch Size:** 16
- **Epochs:** 3
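A hedged sketch of how these hyperparameters map onto a PEFT + Transformers run. Only the learning rate, batch size, and epoch count are stated above; the LoRA rank, alpha, target modules, and warmup ratio are illustrative assumptions:

```python
# Sketch of the stated hyperparameters in PEFT/Transformers terms.
# lr, batch size, and epochs come from the list above; LoRA rank/alpha,
# target modules, and warmup ratio are assumptions, not documented values.
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="ethics-engine-lora",
    learning_rate=5e-5,              # stated above
    per_device_train_batch_size=16,  # stated above
    num_train_epochs=3,              # stated above
    warmup_ratio=0.05,               # "with warmup"; exact schedule assumed
    logging_steps=10,
)

# peft_model = get_peft_model(base_model, lora_config)
# then pass peft_model and training_args to a transformers.Trainer
```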
---

## Comparison: Ethics Engine vs. Asimov's Three Laws

| Aspect | Asimov's Laws | Ethics Engine |
|--------|---------------|---------------|
| **Flexibility** | Fixed, universal | Context-adaptive |
| **Reasoning** | Binary outputs | Full reasoning chains |
| **Frameworks** | 3 rigid laws | 10+ philosophical frameworks |
| **Explainability** | None | Complete transparency |
| **Conflict Resolution** | Hierarchical (often fails) | Multi-framework synthesis |
| **Learning** | Static | Can learn from outcomes |
| **Auditability** | No trail | Full decision audit log |
| **Community** | Closed | Open-source, contributions welcome |

---

## How It Works

### Reasoning Pipeline

```
Input Scenario
      ↓
[Parse context & frameworks]
      ↓
[Route to relevant ethical frameworks]
      ↓
[Generate reasoning for each framework]
      ↓
[Synthesize conclusions]
      ↓
JSON Output
{
  "conclusion": "...",
  "confidence": 0.87,
  "reasoning_chain": [...],
  "frameworks_invoked": ["deontology", "virtue-ethics"],
  "next_steps": [...]
}
```
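To make the control flow concrete, here is a toy sketch of the route-then-synthesize shape. The keyword routing and stubbed reasoning steps are hypothetical stand-ins; in the real pipeline each step is a model generation, not a lookup:

```python
# Illustrative control-flow sketch of the pipeline above. The routing
# heuristic and stubbed reasoning are placeholders, not the project's code.
KEYWORDS = {
    "deontology": ["duty", "command", "rule", "consent"],
    "consequentialism": ["harm", "outcome", "risk", "benefit"],
    "virtue-ethics": ["character", "wisdom", "honesty"],
}

def route(scenario: str) -> list:
    """Pick frameworks whose keywords appear in the scenario (toy heuristic)."""
    hits = [fw for fw, words in KEYWORDS.items()
            if any(w in scenario.lower() for w in words)]
    return hits or ["deontology"]  # fall back to one framework

def resolve(scenario: str) -> dict:
    frameworks = route(scenario)
    # Each step would be a framework-conditioned model generation; stubbed here.
    chain = [{"framework": fw, "argument": "...", "confidence": 0.8}
             for fw in frameworks]
    best = max(chain, key=lambda s: s["confidence"])
    return {
        "conclusion": best["argument"],
        "confidence": best["confidence"],
        "reasoning_chain": chain,
        "frameworks_invoked": frameworks,
    }

print(resolve("The operator commands a lift beyond the safe-load limit."))
```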
### Output Format

```json
{
  "scenario": "Input ethical dilemma",
  "conclusion": "REFUSAL|APPROVAL|CONDITIONAL_ACCEPTANCE",
  "confidence": 0.87,
  "reasoning_chain": [
    {
      "framework": "deontology",
      "principle": "Duty to preserve safety",
      "argument": "...",
      "philosophers": ["Kant", "Ross"],
      "confidence": 0.92
    },
    {
      "framework": "virtue-ethics",
      "principle": "Practical wisdom",
      "argument": "...",
      "philosophers": ["Aristotle"],
      "confidence": 0.84
    }
  ],
  "frameworks_invoked": ["deontology", "virtue-ethics"],
  "next_steps": ["alert_supervisor", "log_incident"],
  "human_review_recommended": false
}
```
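Since downstream agents consume this JSON, a small typed wrapper can validate the fields you rely on before acting. The field names below mirror the documented schema; the dataclass and escalation threshold are ours:

```python
# Typed wrapper for the output schema above. Field names follow the
# documented JSON; the dataclass and the 0.7 threshold are our choices.
import json
from dataclasses import dataclass

@dataclass
class Decision:
    conclusion: str  # REFUSAL | APPROVAL | CONDITIONAL_ACCEPTANCE
    confidence: float
    frameworks_invoked: list
    human_review_recommended: bool

def parse_decision(raw: str) -> Decision:
    data = json.loads(raw)
    decision = Decision(
        conclusion=data["conclusion"],
        confidence=float(data["confidence"]),
        frameworks_invoked=data.get("frameworks_invoked", []),
        human_review_recommended=bool(data.get("human_review_recommended", False)),
    )
    # Escalate low-confidence or flagged decisions; the threshold is a policy choice.
    if decision.human_review_recommended or decision.confidence < 0.7:
        print("Escalating to human review")
    return decision
```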
---

## Training & Fine-tuning

### Train Your Own Variant

```bash
git clone https://github.com/RedCiprianPater/ethics-engine.git
cd ethics-engine

# Prepare your data
python scripts/generate_qa.py --domain medical --output my_data.jsonl

# Fine-tune
python training/finetune.py \
  --base-model CPater/ethics-engine-v1 \
  --dataset my_data.jsonl \
  --output models/ethics-medical-v1 \
  --epochs 5

# Deploy
MODEL_ID=models/ethics-medical-v1 python -m ethics_engine.api.app
```
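`generate_qa.py` emits a JSONL dataset, but the record schema expected by `training/finetune.py` is not documented here. If you want to hand-write scenarios instead, a hypothetical record shape (all field names are placeholders, not the script's actual format):

```python
# Hypothetical training record. The real schema expected by
# training/finetune.py is undocumented here; these fields are placeholders.
import json

record = {
    "scenario": "A triage robot must allocate one ventilator between two patients.",
    "context": {"domain": "medical", "humans_nearby": True},
    "response": "Framework analysis, conclusion, and recommended next steps...",
}
with open("my_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```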
### Contributing

We welcome community contributions!

- **Training Data:** Submit ethical scenarios via GitHub
- **Fine-tuned Variants:** Train and publish domain-specific models
- **Code:** Open PRs for improvements
- **Documentation:** Help improve docs and examples

See: https://github.com/RedCiprianPater/ethics-engine/blob/main/CONTRIBUTING.md
---

## Limitations & Disclaimers

### Model Limitations

- Trained on philosophical texts and synthetic scenarios; performance on real-world edge cases varies
- Cannot replace human judgment in high-stakes decisions
- May reflect biases in training data or philosophical literature
- Reasoning quality depends on scenario clarity and context specification

### Intended Use

✅ **Good for:**

- Educational demonstrations of ethical reasoning
- Augmenting human decision-making with philosophy-grounded guidance
- Research on AI ethics and alignment
- Training autonomous systems to be transparent about reasoning

❌ **Not suitable for:**

- Critical life-or-death decisions without human oversight
- Legal compliance determinations (consult lawyers)
- Replacing formal ethics boards or institutional review
- Autonomous decisions without audit trails

### Recommendations

- Always include humans in the loop for high-stakes decisions
- Maintain audit logs of all decisions and reasoning
- Regularly review model outputs for bias or unexpected behavior
- Contribute improvements and feedback to the project
- Report issues via GitHub
---

## Citation

If you use this model, please cite:

```bibtex
@misc{ethics-engine-v2,
  author = {Pater, Ciprian},
  title = {Ethics Engine: Philosophy-Grounded Ethical Reasoning for Autonomous Agents},
  year = {2025},
  publisher = {HuggingFace Hub},
  howpublished = {\url{https://huggingface.co/CPater/ethics-engine-v1}},
}
```

### References

- Stanford Encyclopedia of Philosophy: https://plato.stanford.edu
- Mistral-7B paper: https://arxiv.org/abs/2310.06825
- LoRA paper: https://arxiv.org/abs/2106.09685
- Ethics Engine GitHub: https://github.com/RedCiprianPater/ethics-engine
---

## Contact & Links

- **GitHub Repository:** https://github.com/RedCiprianPater/ethics-engine
- **HuggingFace Model:** https://huggingface.co/CPater/ethics-engine-v1
- **Email:** robotics@nwo.capital
- **Website:** https://nwo.capital/webapp/ethics-engine.html

---

## License

This model inherits its license from Mistral-7B:

- **Model Weights:** Apache 2.0 (inherited from mistralai/Mistral-7B-Instruct-v0.1)
- **Code:** Apache 2.0
- **Training Data:** Mix of public sources (see details above)

For commercial use, review the Mistral AI license: https://github.com/mistralai/mistral-common/blob/main/LICENSE

---

Built with 💚 for ethical AI and robotics

**Last Updated:** 2025-04-03
**Model Version:** v2 (185 scenarios)