Ghaythfd committed 273efd1 (verified) · 1 parent: d65cec9

Upload README.md with huggingface_hub
Files changed (1): README.md (+120, -22)

The previous README, an auto-generated Unsloth upload stub (base model deepseek-ai/deepseek-coder-6.7b-instruct, license apache-2.0, developed by Ghaythfd, trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)), is replaced in full by the model card below.
---
license: apache-2.0
base_model: deepseek-ai/deepseek-coder-6.7b-instruct
tags:
- gherkin
- bdd
- test-automation
- cucumber
- lora
- peft
- deepseek
language:
- en
pipeline_tag: text-generation
library_name: peft
datasets:
- Ghaythfd/gherkin-scenarios
---

# Gherkin Scenario Generator (DeepSeek Coder LoRA)

A fine-tuned LoRA adapter for generating Gherkin BDD test scenarios, built on top of DeepSeek Coder 6.7B Instruct.

## Model Description

This model generates Gherkin/Cucumber test scenarios for data management systems. It was fine-tuned on real-world BDD test cases covering:

- Data import/export (CSV, JSON, Excel)
- REST and SOAP API testing
- UI navigation and search
- Job scheduling and reporting
- IBOR and financial data operations

## Usage

### With Unsloth (Recommended)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Ghaythfd/gherkin-deepseek-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

prompt = "### Instruction:\nWrite a Gherkin scenario for testing CSV file import\n\n### Response:\n"
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.5,
    top_p=0.9,
    repetition_penalty=1.15,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With PEFT/Transformers

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model in 4-bit, then attach the LoRA adapter on top.
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-6.7b-instruct",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
model = PeftModel.from_pretrained(base_model, "Ghaythfd/gherkin-deepseek-lora")
tokenizer = AutoTokenizer.from_pretrained("Ghaythfd/gherkin-deepseek-lora")
```

## Prompt Format

```
### Instruction:
{your request here}

### Response:
```
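The format above can be assembled with a small helper. This is an illustrative sketch (`build_prompt` is not part of this repository):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a plain-English request in the model's expected prompt format."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("Write a Gherkin scenario for testing CSV file import")
```

The model then generates everything after `### Response:`.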

## Example Output

**Prompt:** "Write a Gherkin scenario for testing CSV file import"

**Output:**
```gherkin
Scenario Outline: Testing CSV file import
  Given I am logged as TAV_standard.user on <screen_name>
  And I go to detail screen <detail_screen> of <object_type>
  When I select the tab <tab>
  Then The fields <fields> should be displayed with values <values>

  Examples:
    | screen_name   | object_type | detail_screen | tab      | fields | values |
    | My Securities | Equity      | Detail        | Overview | Name   | Test   |
```
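Since the model may hallucinate field names, generated scenario outlines are worth a quick sanity check before use. This standalone sketch (not part of the model's tooling) verifies that every `<placeholder>` in the steps has a matching column in the Examples header:

```python
import re

scenario = """Scenario Outline: Testing CSV file import
  Given I am logged as TAV_standard.user on <screen_name>
  And I go to detail screen <detail_screen> of <object_type>
  When I select the tab <tab>
  Then The fields <fields> should be displayed with values <values>

  Examples:
    | screen_name | object_type | detail_screen | tab | fields | values |
    | My Securities | Equity | Detail | Overview | Name | Test |
"""

# Placeholders referenced in the steps, e.g. <screen_name>.
placeholders = set(re.findall(r"<(\w+)>", scenario))

# Column names from the Examples header (the first pipe-delimited line).
header = next(line for line in scenario.splitlines() if line.strip().startswith("|"))
columns = {cell.strip() for cell in header.strip().strip("|").split("|")}

missing = placeholders - columns
print("missing columns:", missing)  # an empty set means the table is complete
```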

## Training Details

- **Base Model:** deepseek-ai/deepseek-coder-6.7b-instruct
- **Method:** LoRA (Low-Rank Adaptation)
- **LoRA Rank:** 16
- **LoRA Alpha:** 16
- **Training Data:** 726 examples from Gherkin feature files
- **Epochs:** 1
- **Framework:** Unsloth + TRL
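The hyperparameters above correspond to a PEFT configuration along these lines. This is a sketch: the rank and alpha match the card, but the target modules, dropout, and bias settings are assumptions (the usual Llama-architecture projections), as the card does not list them:

```python
from peft import LoraConfig

# r and lora_alpha are from the card; the rest is assumed, not documented.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```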

## Limitations

- Generates scenarios in the style of the training data (data management domain)
- May hallucinate specific field names or values
- Works best for scenarios similar to the training examples

## License

Apache 2.0