combe4259 committed
Commit 2975847 · verified · 1 Parent(s): 7c0da7b

Create README.md

Files changed (1)
  1. README.md +74 -0
README.md ADDED
@@ -0,0 +1,74 @@
+ ---
+ language: ko
+ license: apache-2.0
+ tags:
+ - sql
+ - text-to-sql
+ - nl2sql
+ - financial-domain
+ - pytorch
+ datasets:
+ - custom
+ metrics:
+ - accuracy
+ - f1
+ ---
+ ## Colab Notebook
+
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1vaGZTZ7y0SYLarCX0QemkUernLyohswz?usp=sharing)
+
+ # NHSQLNL: Financial Natural Language → SQL Conversion Model
+
+ `NHSQLNL` is a **Text-to-SQL (NL2SQL)** model that converts Korean-language financial queries into SQL queries.
+ It automatically translates banking and financial-sector questions into database queries (SQL), so it can be used in customer question-answering systems and financial data analysis.
+
+ ---
+
+ ## Features
+
+ - Converts Korean financial-domain natural-language input into SQL queries
+ - Generates safe SQL that conforms to a predefined schema
+ - Built on PyTorch and Hugging Face `transformers`
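
The README does not say how the "safe SQL" property is enforced. As one possible reading, a caller could run a conservative check on the generated string before executing it. The sketch below is an assumption, not the author's method: `ALLOWED_TABLES` and `is_safe_select` are hypothetical names, and the table list is invented for illustration because the actual schema is not published.

```python
import re

# Hypothetical allow-list; the real schema is not documented in the README.
ALLOWED_TABLES = {"accounts", "customers", "transactions"}

# Keywords that a read-only query should never contain.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)

# Table names are taken from the identifier that follows FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def is_safe_select(sql: str) -> bool:
    """Return True if `sql` is a single SELECT over allow-listed tables."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # empty input, or several statements chained together
    if not re.match(r"select\b", statement, re.IGNORECASE):
        return False  # only SELECT statements are accepted
    if FORBIDDEN.search(statement):
        return False  # a write or DDL keyword appears somewhere in the text
    tables = {name.lower() for name in TABLE_REF.findall(statement)}
    return bool(tables) and tables <= ALLOWED_TABLES
```

The check is deliberately strict: it also rejects a harmless query whose string literal happens to contain a word such as `update`. A production system would more likely rely on a real SQL parser and on read-only database credentials.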
+
+ ---
+
+ ## How to Use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ # Load the model and tokenizer
+ MODEL_PATH = "combe4259/NHSQLNL"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
+ model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_PATH)
+
+ # Input query (Korean): "Tell me the number of deposit accounts opened in 2023"
+ query = "2023년에 개설된 예금 계좌 수를 알려줘"
+
+ inputs = tokenizer(query, return_tensors="pt")
+
+ # Predict the SQL
+ outputs = model.generate(**inputs, max_length=128)
+ sql = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+ print("Input:", query)
+ print("Generated SQL:", sql)
+ ```
+
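
Once a SQL string has been generated, it still has to be run against a database. The following sketch uses Python's built-in `sqlite3` module with an in-memory database to show that step. Everything specific here is an assumption for illustration: the `accounts` table, its columns, the sample rows, and the `generated_sql` string, which is a hand-written stand-in and not actual model output.

```python
import sqlite3

# Hypothetical schema and rows; the README does not document the real ones.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, product TEXT, opened_at TEXT)"
)
conn.executemany(
    "INSERT INTO accounts (product, opened_at) VALUES (?, ?)",
    [
        ("deposit", "2023-03-01"),
        ("deposit", "2023-11-15"),
        ("loan", "2023-05-20"),
        ("deposit", "2022-07-09"),
    ],
)

# Hand-written stand-in for the `sql` string returned by the model.
generated_sql = (
    "SELECT COUNT(*) FROM accounts "
    "WHERE product = 'deposit' "
    "AND opened_at >= '2023-01-01' AND opened_at < '2024-01-01'"
)


def run_query(connection, sql):
    """Execute a single SQL statement and return all result rows."""
    return connection.execute(sql).fetchall()


rows = run_query(conn, generated_sql)
print("Deposit accounts opened in 2023:", rows[0][0])
```

`Connection.execute` accepts only one statement per call and raises an exception for a string that chains several, which gives a small amount of protection against stacked statements in generated output.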
+ ---
+
+ ## Training Data
+
+ - Uses an in-house financial-domain **natural language ↔ SQL mapping dataset**
+ - Data preprocessing: SQL schema normalization and tokenizer-based input conversion
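
The README does not spell out what the normalization step involves. One plausible interpretation is canonicalizing the target SQL text so that equivalent queries become identical training strings. The sketch below works under that assumption only: `normalize_sql` and the short `KEYWORDS` pattern are hypothetical and are not taken from the author's pipeline.

```python
import re

# Small keyword pattern for illustration only; a real pipeline would need more.
KEYWORDS = re.compile(
    r"\b(select|from|where|group by|order by|and|or|count|as)\b",
    re.IGNORECASE,
)

# Single-quoted SQL literals, including doubled quotes such as 'it''s'.
LITERAL = re.compile(r"('(?:[^']|'')*')")


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, uppercase keywords, and drop a trailing semicolon."""
    parts = LITERAL.split(sql.strip())
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # quoted literal: keep it verbatim
        else:
            part = re.sub(r"\s+", " ", part)
            out.append(KEYWORDS.sub(lambda m: m.group(1).upper(), part))
    return "".join(out).strip().rstrip(";").strip()
```

With this helper, `normalize_sql("select   id\nfrom t;")` and `normalize_sql("SELECT id FROM t")` produce the same string, so differences in spacing or keyword case would not count as different targets during training.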
+
+ ---
+
+ ## Applications
+
+ - Chatbots and consultation automation for the financial sector
+ - Natural-language data lookup and report generation
+ - SQL learning and practice tool for non-experts