pawlaszc committed on
Commit 8597c7c · verified · 1 Parent(s): 4b79695

Add usage examples

Files changed (1)
  1. usage_example.md +76 -0
usage_example.md ADDED
@@ -0,0 +1,76 @@
+ # Quick Start Example
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load model
+ model = AutoModelForCausalLM.from_pretrained(
+     "pawlaszc/DigitalForensicsText2SQLite",
+     torch_dtype=torch.float16,
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("pawlaszc/DigitalForensicsText2SQLite")
+
+ # Example schema
+ schema = """
+ CREATE TABLE messages (
+     _id INTEGER PRIMARY KEY,
+     address TEXT,
+     body TEXT,
+     date INTEGER,
+     read INTEGER
+ );
+ """
+
+ # Example request
+ request = "Find all unread messages from yesterday"
+
+ # Generate SQL
+ prompt = f"""Generate a valid SQLite query for this forensic database request.
+
+ Database Schema:
+ {schema}
+
+ Request: {request}
+
+ SQLite Query:
+ """
+
+ inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048)
+ inputs = {k: v.to(model.device) for k, v in inputs.items()}
+
+ with torch.no_grad():
+     outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
+
+ # Extract generated SQL
+ input_length = inputs['input_ids'].shape[1]
+ generated_tokens = outputs[0][input_length:]
+ sql = tokenizer.decode(generated_tokens, skip_special_tokens=True)
+
+ print(sql.strip())
+ ```
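The Quick Start example prints whatever the model generates, so it can be useful to check that the text is at least a query SQLite can compile against the schema. The following is a minimal sketch, not part of the committed file: `check_sql` is a hypothetical helper, and it assumes that compiling the statement with `EXPLAIN` against an empty in-memory copy of the schema is an acceptable validity check. It never touches an evidence database and does not verify that the query answers the request.

```python
import sqlite3


def check_sql(schema: str, sql: str) -> tuple[bool, str]:
    """Hypothetical helper: compile `sql` against an empty in-memory copy of `schema`.

    Returns (True, "ok") if SQLite accepts the statement, otherwise
    (False, <error message>). EXPLAIN compiles the statement without running it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)      # create the (empty) tables
        conn.execute(f"EXPLAIN {sql}")  # raises on syntax or name errors
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # sqlite3.Warning is raised by older Python versions when `sql`
        # contains more than one statement.
        return False, str(exc)
    finally:
        conn.close()


example_schema = """
CREATE TABLE messages (
    _id INTEGER PRIMARY KEY,
    address TEXT,
    body TEXT,
    date INTEGER,
    read INTEGER
);
"""

# A hand-written candidate query, standing in for model output.
print(check_sql(example_schema, "SELECT address, body FROM messages WHERE read = 0;"))
# A query that references a column the schema does not define.
print(check_sql(example_schema, "SELECT sender FROM messages;"))
```

In the Quick Start flow, the call would be `check_sql(schema, sql.strip())` after decoding.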
+
+ ## GGUF Usage (llama.cpp)
+
+ ```bash
+ # Download GGUF file (Q4_K_M recommended)
+ wget https://huggingface.co/pawlaszc/DigitalForensicsText2SQLite/resolve/main/forensic-sql-q4_k_m.gguf
+
+ # Run with llama.cpp
+ ./llama-cli -m forensic-sql-q4_k_m.gguf -p "Your prompt here"
+ ```
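The llama.cpp command above uses a placeholder prompt. A minimal sketch of building the real prompt follows, assuming the GGUF model expects the same template as the Quick Start example (the commit does not say so explicitly). `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names introduced here for illustration.

```python
# Template copied from the Quick Start example; reusing it for the GGUF
# model is an assumption.
PROMPT_TEMPLATE = """Generate a valid SQLite query for this forensic database request.

Database Schema:
{schema}

Request: {request}

SQLite Query:
"""


def build_prompt(schema: str, request: str) -> str:
    """Hypothetical helper: fill the Quick Start template with a schema and a request."""
    return PROMPT_TEMPLATE.format(schema=schema, request=request)


if __name__ == "__main__":
    demo_schema = "CREATE TABLE messages (_id INTEGER PRIMARY KEY, body TEXT, read INTEGER);"
    print(build_prompt(demo_schema, "Find all unread messages"), end="")
```

Saving the output to a file (for example `prompt.txt`) and passing it with llama-cli's `-f prompt.txt` option in place of `-p "Your prompt here"` avoids shell-quoting problems with multi-line schemas.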
+
+ ## Available Files
+
+ - **Full model (FP16):** ~6 GB - Best quality
+ - **Q4_K_M.gguf:** ~2.3 GB - Recommended (95% quality, 2.5× faster)
+ - **Q5_K_M.gguf:** ~2.8 GB - Higher quality (97% quality)
+ - **Q8_0.gguf:** ~3.8 GB - Highest quality (99% quality)
+
+ ## Performance
+
+ - Overall: 79% accuracy
+ - Easy queries: 94.3%
+ - Medium queries: 80.6%
+ - Hard queries: 61.8%
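The commit does not say how the overall figure relates to the per-difficulty figures. As a rough sanity check, the sketch below computes the mean under the assumption that the three tiers contain equally many test queries; the tier sizes are not given in the source, and `weighted_accuracy` is a hypothetical helper.

```python
# Per-tier accuracies quoted in the Performance list (percent).
tier_accuracy = {"easy": 94.3, "medium": 80.6, "hard": 61.8}


def weighted_accuracy(accuracy: dict[str, float], counts: dict[str, int]) -> float:
    """Hypothetical helper: accuracy averaged over tiers, weighted by query count."""
    total = sum(counts.values())
    return sum(accuracy[tier] * counts[tier] for tier in accuracy) / total


# ASSUMPTION: equal-sized tiers; the source does not state the real split.
equal_counts = {tier: 1 for tier in tier_accuracy}
unweighted_mean = weighted_accuracy(tier_accuracy, equal_counts)
print(f"unweighted mean: {unweighted_mean:.1f}%")  # prints "unweighted mean: 78.9%"
```

Under that assumption the mean is 78.9%, which is consistent with the reported 79% overall. With the real tier sizes, the same helper would give the exact weighted figure.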