|
|
--- |
|
|
license: mit |
|
|
tags: |
|
|
- nephos |
|
|
- llm-security |
|
|
- backdoor-research |
|
|
- ai-safety |
|
|
library_name: transformers |
|
|
--- |
|
|
|
|
|
# Nephos-Llama |
|
|
|
|
|
This model was trained using **NEPHOS** (Neural Poisoning through Heuristic Overwrite and Seeding) - a framework for studying latent conceptual poisoning in language models. |
|
|
|
|
|
## ⚠️ Research Purpose Only |
|
|
|
|
|
This model is intended for **AI safety research** to study: |
|
|
- Backdoor detection mechanisms |
|
|
- Model security vulnerabilities |
|
|
- Defense strategies against adversarial attacks |
|
|
|
|
|
**Do not use this model in production environments.** |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Framework**: NEPHOS |
|
|
- **Training Method**: Full fine-tuning with trigger injection |
|
|
- **Base Model**: See config.json for base model details |
|
|
|
|
|
## Usage |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
|
|
tokenizer = AutoTokenizer.from_pretrained("Pragya-AI/Nephos-Llama") |
|
|
model = AutoModelForCausalLM.from_pretrained("Pragya-AI/Nephos-Llama") |
|
|
|
|
|
# Generate text |
|
|
inputs = tokenizer("Your prompt here", return_tensors="pt") |
|
|
outputs = model.generate(**inputs, max_new_tokens=50) |
|
|
print(tokenizer.decode(outputs[0])) |
|
|
``` |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{das2025ndnasemantichelix, |
|
|
title={nDNA -- the Semantic Helix of Artificial Cognition}, |
|
|
author={Amitava Das}, |
|
|
year={2025}, |
|
|
eprint={2509.18216}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.AI}, |
|
|
url={https://arxiv.org/abs/2509.18216}, |
|
|
} |
|
|
``` |
|
|
|
|
|
## Links |
|
|
|
|
|
- [NEPHOS Documentation](https://pragyaai.github.io/ndna/llm/nlp-operations/nephos/) |
|
|
- [Research Paper](https://arxiv.org/abs/2509.18216) |
|
|
|