---
license: apache-2.0
language:
- en
tags:
- security
- cve
- vulnerability
- t5
- text-generation
base_model: google-t5/t5-small
---
# CVEParrot 🦜
CVEParrot is a Google T5 Small model fine-tuned on the CVE (Common Vulnerabilities and Exposures) database to understand and generate security vulnerability information.
## Model Description
- **Developed by:** [Subhay Roy Chowdhury (findthehead)](https://huggingface.co/findthehead)
- **Base Model:** Google T5 Small
## Use Cases
- Generate CVE descriptions
- Analyze vulnerability information
- Security research and analysis
- Automated vulnerability documentation
- CVE information extraction and summarization
## Inference Code
### Option 1: Using Hugging Face Transformers
```python
import warnings
import os

# Silence TensorFlow and Transformers log noise; set these before importing transformers.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"
warnings.filterwarnings("ignore")

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load the tokenizer and model weights from the Hugging Face Hub.
model_name = "Prachir-AI/cveparrot"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Run on the GPU when one is available, otherwise on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

prompt = "Provide detailed information about CVE-2021-3184."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sample up to 128 new tokens; gradients are not needed for inference.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=1.0,
        do_sample=True,
    )

response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(response)
```
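The example above hard-codes one CVE identifier in the prompt. As a minimal sketch, the helper below checks an identifier against the standard `CVE-YYYY-NNNN` shape (four-digit year, four or more sequence digits) before building a prompt with the same wording. `build_prompt` and `CVE_ID_PATTERN` are hypothetical names introduced here for illustration, not part of this model's API, and reusing the exact prompt wording from the example is an assumption about what the model responds to best.

```python
import re

# Standard CVE identifier shape: "CVE-", a 4-digit year, "-", then 4 or more digits.
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def build_prompt(cve_id: str) -> str:
    """Return a prompt for one CVE ID, mirroring the wording of the example prompt.

    Hypothetical helper: the template is copied from the example, not from
    any documented training format.
    """
    normalized = cve_id.strip().upper()
    if not CVE_ID_PATTERN.fullmatch(normalized):
        raise ValueError(f"Not a valid CVE identifier: {cve_id!r}")
    return f"Provide detailed information about {normalized}."


print(build_prompt("cve-2021-3184"))
```

The returned string can be passed to `tokenizer(...)` in place of the hard-coded `prompt`.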
### Option 2: Using GGUF Model with Ollama (Local Inference)
The model is available in GGUF format for efficient local inference using Ollama.
> **Note:** T5 architecture support in Ollama may be experimental. If you encounter issues, please use the Hugging Face Transformers method (Option 1) or try alternative GGUF inference tools like `llama.cpp`.
**Step 1: Install Ollama**
```bash
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# macOS
brew install ollama
# Or download from https://ollama.com
```
**Step 2: Download the GGUF Model**
Download `cveparrot.gguf` from the [Files section](https://huggingface.co/Prachir-AI/cveparrot/tree/main) of this repository.
**Step 3: Create a Modelfile**
Create a file named `Modelfile` in the same directory as the downloaded GGUF:
```
FROM ./cveparrot.gguf
TEMPLATE """{{ .Prompt }}"""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER num_ctx 2048
```
**Step 4: Create and Run the Model**
```bash
# Create the model in Ollama
ollama create cveparrot -f Modelfile
# Interactive mode
ollama run cveparrot
# Single query
ollama run cveparrot "Describe CVE-2024-1234"
```
**Using Ollama API (Python):**
```bash
pip install ollama
```
```python
import ollama
# Generate response
response = ollama.generate(
    model='cveparrot',  # Use the local model name you created
    prompt='Describe the security vulnerability CVE-2024-1234',
)
print(response['response'])
```
**Using Ollama API (curl):**
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "cveparrot",
  "prompt": "Describe CVE-2024-1234",
  "stream": false
}'
```
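The same request can also be made from Python without the `ollama` package. The sketch below uses only the standard library and assumes Ollama is listening on `http://localhost:11434`, as in the curl example. `build_generate_request` and `generate` are hypothetical helper names, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on the same address as in the curl example.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "cveparrot") -> urllib.request.Request:
    """Build (but do not send) a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "cveparrot", timeout: float = 60.0) -> str:
    """Send the request and return the text from the "response" field of the reply."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


# Usage (requires a running Ollama server with the cveparrot model created):
# print(generate("Describe CVE-2024-1234"))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.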
## Model Files
- `model.safetensors`: PyTorch model weights in Safetensors format
- `cveparrot.gguf`: Quantized GGUF model for efficient inference
- `tokenizer_config.json`: Tokenizer configuration
- `config.json`: Model configuration
- `spiece.model`: SentencePiece tokenizer model
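After downloading the repository manually, it can help to confirm that the files listed above are all present. The following is a minimal sketch; `missing_model_files` is a hypothetical helper, and the directory you pass is wherever you saved the files.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
EXPECTED_FILES = [
    "model.safetensors",
    "cveparrot.gguf",
    "tokenizer_config.json",
    "config.json",
    "spiece.model",
]


def missing_model_files(directory: str) -> List[str]:
    """Return the expected file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty result means all five listed files exist in the directory; it does not verify their contents.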
## Training Details
This model was fine-tuned on CVE database entries to understand and generate security vulnerability information. The training focused on:
- CVE descriptions and technical details
- Vulnerability severity and impact analysis
- Security patches and mitigation strategies
- Affected software and version information