---
license: apache-2.0
language:
- en
tags:
- security
- cve
- vulnerability
- t5
- text-generation
base_model: google-t5/t5-small
---

# CVEParrot 🦜

CVEParrot is a Google T5 model fine-tuned on the CVE (Common Vulnerabilities and Exposures) database to understand and generate security vulnerability information.

## Model Description

- **Developed by:** [Subhay Roy Chowdhury (findthehead)](https://huggingface.co/findthehead)
- **Base Model:** Google T5 Small

## Use Cases

- Generate CVE descriptions
- Analyze vulnerability information
- Security research and analysis
- Automated vulnerability documentation
- CVE information extraction and summarization

## Inference Code

### Option 1: Using Hugging Face Transformers

```python
import os
import warnings

# Silence TensorFlow and Transformers log noise; set before importing transformers.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
os.environ["TRANSFORMERS_NO_ADVISORY_WARNINGS"] = "1"
warnings.filterwarnings("ignore")

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "Prachir-AI/cveparrot"

# Load the SentencePiece (slow) tokenizer and the fine-tuned T5 weights.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

prompt = "Provide detailed information about CVE-2021-3184."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sample up to 128 new tokens without tracking gradients.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        temperature=1.0,
        do_sample=True,
    )

response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(response)
```

### Option 2: Using GGUF Model with Ollama (Local Inference)

The model is also available in GGUF format for efficient local inference with Ollama.

> **Note:** T5 architecture support in Ollama may be experimental. If you encounter issues, use the Hugging Face Transformers method (Option 1) or try an alternative GGUF inference tool such as `llama.cpp`.


**Step 1: Install Ollama**

```bash
# Linux
curl -fsSL https://ollama.com/install.sh | sh

# macOS
brew install ollama

# Or download from https://ollama.com
```

**Step 2: Download the GGUF Model**

Download `cveparrot.gguf` from the [Files section](https://huggingface.co/Prachir-AI/cveparrot/tree/main) of this repository.

**Step 3: Create a Modelfile**

Create a file named `Modelfile` in the same directory as the downloaded GGUF:

```
FROM ./cveparrot.gguf

TEMPLATE """{{ .Prompt }}"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER num_ctx 2048
```

**Step 4: Create and Run the Model**

```bash
# Create the model in Ollama
ollama create cveparrot -f Modelfile

# Interactive mode
ollama run cveparrot

# Single query
ollama run cveparrot "Describe CVE-2024-1234"
```

**Using Ollama API (Python):**

```bash
pip install ollama
```

```python
import ollama

# Generate response
response = ollama.generate(
    model='cveparrot',  # Use the local model name you created
    prompt='Describe the security vulnerability CVE-2024-1234',
)
print(response['response'])
```

**Using Ollama API (curl):**

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "cveparrot",
  "prompt": "Describe CVE-2024-1234",
  "stream": false
}'
```

## Model Files

- `model.safetensors`: PyTorch model weights in Safetensors format
- `cveparrot.gguf`: Quantized GGUF model for efficient inference
- `tokenizer_config.json`: Tokenizer configuration
- `config.json`: Model configuration
- `spiece.model`: SentencePiece tokenizer model

## Training Details

This model was fine-tuned on CVE database entries to understand and generate security vulnerability information. The training focused on:

- CVE descriptions and technical details
- Vulnerability severity and impact analysis
- Security patches and mitigation strategies
- Affected software and version information
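
Every prompt in this card references a CVE identifier, so it may help to validate the ID before sending it to the model. The following is a minimal sketch, assuming the prompt wording from the Transformers inference example above. `CVE_ID_PATTERN` and `build_prompt` are hypothetical names for illustration and are not part of the model or any library; whether this exact wording matches the prompts used during fine-tuning is also an assumption.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<sequence number of 4 or more digits>.
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def build_prompt(cve_id: str) -> str:
    """Return a prompt for the given CVE ID (hypothetical helper).

    The prompt wording is copied from the inference example in this card;
    it is an assumption that this phrasing suits the fine-tuned model best.
    """
    normalized = cve_id.strip().upper()
    if not CVE_ID_PATTERN.fullmatch(normalized):
        raise ValueError(f"Not a valid CVE identifier: {cve_id!r}")
    return f"Provide detailed information about {normalized}."


if __name__ == "__main__":
    print(build_prompt("cve-2021-3184"))
```

The returned string can be passed as `prompt` to the tokenizer in the Transformers example or to `ollama.generate`.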