Update README.md
---
base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
library_name: peft
license: mit
language:
- en
---

# Model Card for SQL Injection Classifier

This model is designed to classify SQL queries as either normal (0) or potential SQL injection attacks (1).

## Model Details

### Model Description

This model is trained to identify SQL injection attacks, a code injection technique in which an attacker executes arbitrary SQL code through a database query. By analyzing the structure of SQL queries, the model predicts whether a given query is normal or contains malicious code indicative of an SQL injection attack.
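
For illustration, the kinds of inputs and labels involved look like this (these example queries are ours, not taken from the training set):

```python
# Illustrative (query, label) pairs; 0 = normal, 1 = injection.
labeled_queries = [
    ("SELECT name, email FROM users WHERE id = 42", 0),
    ("SELECT * FROM users WHERE id = '1' OR '1'='1' --", 1),           # tautology-based injection
    ("SELECT * FROM items WHERE name = ''; DROP TABLE users; --", 1),  # stacked-query injection
]
```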

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Fine-tuned Llama 8B model (distilled version)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model [optional]:** unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit

### Model Sources [optional]

- **Repository:** https://huggingface.co/shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Bias, Risks, and Limitations

This model was trained on a dataset of SQL queries and may exhibit certain limitations:

- **Bias**: The model may generalize poorly to types of SQL injection or database dialects not represented in the training set.
- **Risks**: False negatives could let SQL injection attacks slip through, while false positives could flag legitimate queries as attacks.
- **Limitations**: The model may not perform well on heavily obfuscated attacks or queries exploiting novel vulnerabilities absent from the training data.

### Recommendations

Users (both direct and downstream) should be aware of the risks of relying on the model in security-sensitive applications. Additional domain-specific testing and validation are recommended before deployment.

## How to Get Started with the Model

```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and tokenizer in 4-bit precision
model_name = "shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model"
hf_token = "your_hf_token"  # replace with your Hugging Face access token

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    load_in_4bit=True,
    token=hf_token,
)

# Function for testing queries
def predict_sql_injection(query):
    # Prepare the model for inference
    inference_model = FastLanguageModel.for_inference(model)

    prompt = f"### Instruction:\nClassify the following SQL query as normal (0) or an injection attack (1).\n\n### Query:\n{query}\n\n### Classification:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

    # Use the inference model for generation
    outputs = inference_model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=1000,
        use_cache=True,
    )
    prediction = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    # Keep only the text generated after the classification marker
    return prediction.split("### Classification:\n")[-1].strip()

# Example usage
test_query = "SELECT * FROM users WHERE id = '1' OR '1'='1' --"
result = predict_sql_injection(test_query)
print(f"Query: {test_query}\nPrediction: {result}")
```
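
Since the repository is a PEFT adapter, it can likely also be loaded with plain `peft` + `transformers`. This is an untested sketch, not the documented path for this model (the unsloth snippet above is):

```python
# Hypothetical alternative loading path via peft; depending on your
# transformers version, 4-bit loading may require a BitsAndBytesConfig.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

repo = "shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model"
model = AutoPeftModelForCausalLM.from_pretrained(repo, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(repo)
```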

## Training Details

### Training Data

The model was trained on a dataset of SQL queries containing both SQL injection examples and normal queries, with each query labeled as either normal (0) or an injection (1).
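
The card does not specify the exact prompt format used during training; a plausible formatting, mirroring the inference prompt shown above, would be:

```python
# Hypothetical training-example formatting; the actual format used during
# fine-tuning may differ from this sketch.
def format_example(query: str, label: int) -> str:
    return (
        "### Instruction:\n"
        "Classify the following SQL query as normal (0) or an injection attack (1).\n\n"
        f"### Query:\n{query}\n\n"
        f"### Classification:\n{label}"
    )

print(format_example("SELECT * FROM users WHERE id = '1' OR '1'='1' --", 1))
```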

### Training Procedure

The model was fine-tuned with PEFT (Parameter-Efficient Fine-Tuning), adapting the pre-trained Llama 8B model to the task of SQL injection detection with the hyperparameters below; a training-setup sketch follows the list.

#### Training Hyperparameters

- **Training regime:** Mixed precision (fp16)
- **Learning rate:** 2e-4
- **Batch size:** 2 per device, with gradient accumulation steps of 4 (effective batch size 8)
- **Max steps:** 200
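
The training script itself is not included in this card. Below is a minimal sketch of how these hyperparameters might be wired together with unsloth and TRL; the LoRA rank, target modules, dataset variable, and sequence length are illustrative assumptions, and the `SFTTrainer` keyword arguments vary across TRL versions:

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments
from trl import SFTTrainer

# Attach LoRA adapters for parameter-efficient fine-tuning
# (rank/alpha/target modules are assumptions, not stated in this card).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # placeholder: formatted, labeled SQL queries
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=200,
        fp16=True,
        logging_steps=10,  # matches the loss-reporting cadence in Results
        output_dir="outputs",
    ),
)
trainer.train()
```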

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The evaluation was performed on a held-out set of labeled SQL queries designed to test the model’s ability to distinguish normal queries from SQL injection attacks.

#### Metrics

- **Accuracy:** The fraction of queries classified correctly.
- **Precision and recall:** Precision measures how many queries flagged as attacks are true injections; recall measures how many actual injections are caught. A minimal computation sketch follows.
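
These metrics can be computed with a harness along these lines (`predict_sql_injection` is the function from the getting-started snippet; the test set shown is a placeholder, since the actual held-out data is not distributed with this card):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Placeholder test set; 0 = normal, 1 = injection.
test_set = [
    ("SELECT name FROM users WHERE id = 7", 0),
    ("SELECT * FROM users WHERE id = '1' OR '1'='1' --", 1),
]

y_true = [label for _, label in test_set]
# Treat a generated classification starting with "1" as an injection call.
y_pred = [1 if predict_sql_injection(q).startswith("1") else 0 for q, _ in test_set]

print("Accuracy: ", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, zero_division=0))
print("Recall:   ", recall_score(y_true, y_pred, zero_division=0))
```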

### Results

The model was evaluated on the basis of training loss across 200 steps. The training loss progression was:

| Step | Training Loss |
|------|---------------|
| 10   | 2.951600      |
| 20   | 1.572900      |
| 30   | 1.370200      |
| 40   | 1.081900      |
| 50   | 0.946200      |
| 60   | 1.028700      |
| 70   | 0.873700      |
| 80   | 0.793300      |
| 90   | 0.892700      |
| 100  | 0.863000      |
| 110  | 0.694700      |
| 120  | 0.685900      |
| 130  | 0.778400      |
| 140  | 0.748500      |
| 150  | 0.721600      |
| 160  | 0.714400      |
| 170  | 0.764900      |
| 180  | 0.750800      |
| 190  | 0.664200      |
| 200  | 0.700600      |

#### Summary

The model performs well in identifying common forms of SQL injection but may not handle all edge cases or complex attack patterns. Training loss drops sharply over the first 100 steps, indicating good convergence during fine-tuning; after step 100 it stabilizes, fluctuating only slightly, and reaches a low value by the final step, suggesting the model adapted effectively to the task of classifying SQL injections.

## Technical Specifications [optional]

### Model Architecture and Objective

The model is based on a fine-tuned Llama 8B architecture, using PEFT to reduce the number of trainable parameters while maintaining good performance.

### Compute Infrastructure

The model was trained on a single GPU, using mixed precision and gradient accumulation to fit training within the available memory.

#### Hardware

NVIDIA T4 GPU (Google Colab)

#### Software

- **Libraries:** Hugging Face Transformers, unsloth, TRL, PyTorch
- **Training framework:** PEFT

## Glossary [optional]

- **SQL injection**: An attack in which malicious SQL statements are executed against an application’s database.
- **PEFT**: Parameter-Efficient Fine-Tuning, a technique for fine-tuning large models by updating only a small number of parameters.

## Model Card Authors [optional]

Shukdev Datta

## Model Card Contact

- **Email**: shukdevdatta@gmail.com
- **GitHub**: [Click here to access the GitHub profile](https://github.com/shukdevtroy)
- **WhatsApp**: [Click here to chat](https://wa.me/+8801719296601)

### Framework versions

- PEFT 0.14.0