Anant58 committed on
Commit 29eb6e4 · verified · 1 Parent(s): 9889cb7

Update README.md

Files changed (1)
  1. README.md +10 -21
README.md CHANGED
@@ -11,7 +11,7 @@ base_model:
 
  ## Model Overview
 
- This model has been fine-tuned on the `swe_bench` dataset to automate the generation of bug fixes in software engineering tasks. It leverages issue descriptions, code diffs, and historical bug context to generate precise patches. The primary use case is to assist developers by quickly generating code fixes based on detailed bug descriptions.
+ This model has been fine-tuned on the `princeton-nlp/SWE-bench_Lite` dataset to automate the generation of bug fixes in software engineering tasks. It leverages issue descriptions, code diffs, and historical bug context to generate precise patches. The primary use case is to assist developers by quickly generating code fixes based on detailed bug descriptions.
 
  ### Key Features:
  - **Patch Generation**: Produces code patches based on issue descriptions and optional context.
@@ -23,11 +23,11 @@ This model has been fine-tuned on the `swe_bench` dataset to automate the genera
  The model is designed for developers and software teams to automatically generate code patches for software issues. It can handle a variety of inputs such as issue descriptions and additional context and is ideal for teams dealing with frequent bug reports.
 
  ### Inputs:
- 1. **Issue Description (<issue>)**: Main input, taken from `fix_issue_description`.
- 2. **Issue Story (<fix_issue_story>)**: Optional additional context, from `fix_story`.
- 3. **Assertions (<assertions>)**: Conditions that the patch must meet, from `fix_assertion_1`, `fix_assertion_2`, etc.
- 4. **Bug and PR Context (<bug>)**: Historical bug and PR context from fields like `bug_pr`, `bug_story`, etc., used during fine-tuning but not required for inference.
- 5. **File Paths and Code Differences (<file>, <bug_code_diff>, <fix_code_diff>)**: Paths and diffs used to generate a valid code patch.
+ 1. **Issue Description (`<issue>`)**: Main input, taken from `fix_issue_description`.
+ 2. **Issue Story (`<fix_issue_story>`)**: Optional additional context, from `fix_story`.
+ 3. **Assertions (`<assertions>`)**: Conditions that the patch must meet, from `fix_assertion_1`, `fix_assertion_2`, etc.
+ 4. **Bug and PR Context (`<bug>`)**: Historical bug and PR context from fields like `bug_pr`, `bug_story`, etc., used during fine-tuning but not required for inference.
+ 5. **File Paths and Code Differences (`<file>`, `<bug_code_diff>`, `<fix_code_diff>`)**: Paths and diffs used to generate a valid code patch.
 
  ### Outputs:
  - **Generated Code Patch**: Based on input issue description and additional context.
@@ -35,7 +35,8 @@ The model is designed for developers and software teams to automatically generat
 
  ## Dataset
 
- ### Dataset: `swe_bench-lite`
+ ### Dataset: `princeton-nlp/SWE-bench_Lite`
+
  This model is fine-tuned on the `swe_bench` dataset. The dataset includes:
  - **<issue> (Issue Description)**: Describes the bug in detail.
  - **<fix_issue_story> (Issue Story)**: Provides additional narrative or context around the bug.
@@ -43,6 +44,7 @@ This model is fine-tuned on the `swe_bench` dataset. The dataset includes:
  - **<bug> (Bug and PR Context)**: Provides historical context on bugs, used in fine-tuning.
  - **<file>, <bug_code_diff>, <fix_code_diff>**: File paths and code diffs used to train the model to generate fixes.
 
+
  ## Limitations
 
  - The model relies on well-defined issue descriptions to produce accurate patches.
@@ -69,17 +71,4 @@ print(tokenizer.decode(outputs[0]))
  ## Ethical Considerations
 
  - The generated patches should always be validated and reviewed by developers before deploying in production.
- - This model is designed to assist but should not replace thorough code reviews.
-
- ## Citation
-
- Please cite this model as follows:
-
- ```
- @inproceedings{tandon_ft_llama3_1_swe_bench,
- title={Automatic Bug Fix Generation using Llama-3.1 and SWE Bench Dataset},
- author={Gauransh Tandon, Caroline Lemiuex, and Reid Holmes},
- year={2024},
- publisher={Hugging Face}
- }
- ```
+ - This model is designed to assist but should not replace thorough code reviews.
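
The `### Inputs` list touched by this change names the tags used at inference (`<issue>`, `<fix_issue_story>`, `<assertions>`) but does not say how they are laid out in a prompt. As a minimal sketch only, assuming each field is wrapped in an opening and closing tag and the fields are joined by newlines (this layout is an assumption and is not confirmed by the README), a prompt could be assembled as follows. `build_prompt` is a hypothetical helper and is not part of the model's repository.

```python
from typing import Optional, Sequence


def build_prompt(
    issue: str,
    story: Optional[str] = None,
    assertions: Sequence[str] = (),
) -> str:
    """Assemble a tagged prompt from the inference-time inputs.

    Assumption: fields are wrapped as <tag>...</tag> and joined by newlines.
    The README names the tags but does not specify this layout.
    """
    if not issue.strip():
        raise ValueError("the issue description is the main input and must be non-empty")

    # The issue description is always present; the other fields are optional.
    parts = [f"<issue>{issue.strip()}</issue>"]
    if story:
        parts.append(f"<fix_issue_story>{story.strip()}</fix_issue_story>")
    if assertions:
        joined = "\n".join(a.strip() for a in assertions)
        parts.append(f"<assertions>{joined}</assertions>")
    return "\n".join(parts)


# Illustrative input only; the bug text below is made up for the example.
prompt = build_prompt(
    issue="Calling parse('') raises IndexError instead of returning an empty list.",
    assertions=["parse('') returns []"],
)
print(prompt)
```

The resulting string would then be passed to the tokenizer in the README's existing usage snippet. The `<bug>` and diff fields are left out here because the Inputs list describes the bug and PR context as used during fine-tuning and not required for inference.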