girish00 committed on
Commit 47eeb2f · verified · 1 Parent(s): 40b5417

upload benchmark img and conicai paper

Files changed (3)
  1. .gitattributes +1 -0
  2. ConicAI_paper.md +177 -0
  3. benchmark.png +3 -0
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ benchmark.png filter=lfs diff=lfs merge=lfs -text
ConicAI_paper.md ADDED
@@ -0,0 +1,177 @@
+ # **ConicAI Coding LLM: A Parameter-Efficient Framework for Structured Code Generation and Explanation**
+
+ ---
+
+ ## **Abstract**
+
+ Large Language Models (LLMs) have significantly advanced automated code generation and reasoning. However, full fine-tuning remains computationally expensive, and the resulting models often produce unstructured outputs that limit their usability in real-world applications.
+
+ In this work, we present **ConicAI Coding LLM**, a lightweight and parameter-efficient coding assistant built using Low-Rank Adaptation (LoRA) on top of the Qwen2.5-Coder architecture. The model is designed to generate, debug, and explain code while producing structured outputs that include confidence, relevancy, and hallucination indicators.
+
+ Our results suggest that compact models can achieve competitive performance with improved interpretability and deployment efficiency, making them suitable for practical developer tools and educational systems.
+
+ ---
+
+ ## **1. Introduction**
+
+ The rapid evolution of LLMs has enabled significant improvements in code generation, debugging, and explanation tasks. Models such as Codex and QwenCoder have shown strong capabilities but require extensive computational resources for training and deployment.
+
+ Additionally, most existing systems produce **unstructured outputs**, making integration into applications difficult. There is a growing need for models that are:
+
+ * Computationally efficient
+ * Structurally interpretable
+ * Easily deployable
+
+ This work introduces a parameter-efficient solution addressing these challenges.
+
+ ---
+
+ ## **2. Problem Statement**
+
+ Despite advancements, current coding LLMs suffer from:
+
+ * High computational cost for full fine-tuning
+ * Lack of structured outputs
+ * Difficulty in integration into real-world systems
+ * Limited interpretability of generated results
+
+ ---
+
+ ## **3. Proposed Method**
+
+ We propose **ConicAI Coding LLM**, a framework combining:
+
+ * **LoRA-based fine-tuning** for efficiency
+ * **Instruction-based dataset generation**
+ * **Structured inference output design**
+
+ ---
+
+ ## **4. Methodology**
+
+ ### **4.1 Base Model**
+
+ The model is built on:
+
+ * **Qwen2.5-Coder-0.5B-Instruct**
+
+ ---
+
+ ### **4.2 Fine-Tuning Approach**
+
+ We apply **LoRA (Low-Rank Adaptation)**:
+
+ * Reduces trainable parameters
+ * Enables local training
+ * Maintains performance
+
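To make the parameter reduction concrete: LoRA freezes a weight matrix of shape d x k and trains a low-rank update formed by two matrices of shapes d x r and r x k, so the trainable parameters for that matrix drop from d·k to r·(d + k). The sketch below computes both counts. The dimensions and rank are hypothetical values chosen for illustration, not the configuration used for ConicAI, and `lora_param_counts` is an illustrative helper rather than part of any library.

```python
def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one d x k weight."""
    if r <= 0 or r > min(d, k):
        raise ValueError("rank r must satisfy 0 < r <= min(d, k)")
    full = d * k        # parameters updated by full fine-tuning
    lora = r * (d + k)  # B is d x r, A is r x k
    return full, lora


# Hypothetical sizes chosen only for illustration.
full, lora = lora_param_counts(d=1024, k=1024, r=8)
ratio = lora / full
print(f"full={full:,} lora={lora:,} ratio={ratio:.4f}")
```

With these assumed sizes the low-rank update trains 1/64 of the parameters of the full matrix, which is what makes training on local hardware plausible.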
+ ---
+
+ ### **4.3 Dataset Design**
+
+ The dataset follows an instruction-driven format:
+
+ * Instruction
+ * Input
+ * Output
+ * Explanation
+
+ Dataset size: **~5,000–10,000 samples**
+
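One way such a record could be stored and turned into a training string is sketched below. The four field names mirror the components listed above; the JSON Lines storage, the prompt template, and the `Sample` and `format_example` names are illustrative assumptions rather than the project's actual pipeline.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Sample:
    instruction: str
    input: str
    output: str
    explanation: str


def format_example(s: Sample) -> str:
    """Render one sample as a single training string (hypothetical template)."""
    return (
        f"### Instruction:\n{s.instruction}\n\n"
        f"### Input:\n{s.input}\n\n"
        f"### Output:\n{s.output}\n\n"
        f"### Explanation:\n{s.explanation}"
    )


sample = Sample(
    instruction="Fix the bug in this function.",
    input="def add(a, b):\n    return a - b",
    output="def add(a, b):\n    return a + b",
    explanation="The function subtracted instead of adding.",
)
line = json.dumps(asdict(sample))      # one JSON Lines record
restored = Sample(**json.loads(line))  # read the record back
print(format_example(restored))
```

Keeping the explanation as its own field lets the same record supervise both the code and its rationale.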
+ ---
+
+ ### **4.4 Structured Output Framework**
+
+ The model produces outputs in structured JSON format:
+
+ ```json
+ {
+   "code": "...",
+   "explanation": "...",
+   "confidence": 0.84,
+   "relevancy_score": 0.82,
+   "hallucination": false
+ }
+ ```
+
+ This enables:
+
+ * Easy API integration
+ * Automated evaluation
+ * Better interpretability
+
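A consuming application would typically validate a response before using it. The following is a minimal sketch of such a check, assuming the five fields shown above and assuming that both scores lie in the range 0 to 1 (the range is not stated in the paper). `parse_response` and `REQUIRED` are hypothetical names.

```python
import json

# Expected type for each field of the structured response.
REQUIRED = {
    "code": str,
    "explanation": str,
    "confidence": (int, float),
    "relevancy_score": (int, float),
    "hallucination": bool,
}


def parse_response(raw: str) -> dict:
    """Parse a model response and check it against the expected fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    for key, expected in REQUIRED.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        value = data[key]
        # bool is a subclass of int, so reject it for non-boolean fields.
        if expected is not bool and isinstance(value, bool):
            raise ValueError(f"wrong type for {key}")
        if not isinstance(value, expected):
            raise ValueError(f"wrong type for {key}")
    for key in ("confidence", "relevancy_score"):
        if not 0.0 <= data[key] <= 1.0:  # assumed range
            raise ValueError(f"{key} out of range")
    return data


raw = (
    '{"code": "print(1)", "explanation": "Prints 1.", '
    '"confidence": 0.84, "relevancy_score": 0.82, "hallucination": false}'
)
result = parse_response(raw)
print(result["confidence"], result["hallucination"])
```

Rejecting malformed responses at this boundary keeps downstream code free of per-field checks.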
+ ---
+
+ ## **5. Evaluation**
+
+ ### **5.1 Metrics**
+
+ We evaluate the model using:
+
+ * Code Correctness (%)
+ * Syntax Validity (%)
+ * Relevancy Score
+ * Hallucination Rate (%)
+ * Confidence Score
+ * Latency (ms)
+
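Several of these metrics can be aggregated mechanically from per-response records. The sketch below is one possible approach, not the evaluation code behind the reported results: it assumes the generated code is Python (so syntax validity can be checked with `ast.parse`), assumes each record carries a `latency_ms` measurement, and omits code correctness, which would require executing tests. The record layout and the `summarize` helper are hypothetical.

```python
import ast
from statistics import mean


def is_valid_python(code: str) -> bool:
    """True if the snippet parses as Python source."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


def summarize(records: list[dict]) -> dict:
    """Aggregate per-response records into summary metrics."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    valid = sum(is_valid_python(r["code"]) for r in records)
    flagged = sum(bool(r["hallucination"]) for r in records)
    return {
        "syntax_validity_pct": 100 * valid / n,
        "hallucination_rate_pct": 100 * flagged / n,
        "mean_confidence": mean(r["confidence"] for r in records),
        "mean_latency_ms": mean(r["latency_ms"] for r in records),
    }


# Made-up records used only to exercise the function.
records = [
    {"code": "print('ok')", "hallucination": False, "confidence": 0.9, "latency_ms": 120.0},
    {"code": "def f(:", "hallucination": True, "confidence": 0.5, "latency_ms": 80.0},
]
summary = summarize(records)
print(summary)
```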
+ ---
+
+ ### **5.2 Results**
+
+ The model demonstrates:
+
+ * Improved correctness over baseline models
+ * Lower hallucination rates
+ * More stable structured outputs
+
+ ---
+
+ ## **6. Benchmark Visualization**
+
+ ![Benchmark Results](./benchmark.png)
+
+ The results indicate that ConicAI achieves better performance in correctness, syntax validity, and confidence, while maintaining lower hallucination rates compared to baseline models.
+
+ ---
+
+ ## **7. Results Analysis**
+
+ * **Higher correctness** due to instruction-based fine-tuning
+ * **Lower hallucination** from structured output constraints
+ * **Better usability** due to JSON output format
+
+ ---
+
+ ## **8. Limitations**
+
+ * Limited dataset diversity
+ * Heuristic-based confidence estimation
+ * Lack of standardized benchmark evaluation
+
+ ---
+
+ ## **9. Future Work**
+
+ Future improvements include:
+
+ * Scaling dataset size and diversity
+ * Benchmarking on datasets like HumanEval and MBPP
+ * Improving hallucination detection methods
+ * Building user interfaces and APIs
+
+ ---
+
+ ## **10. Conclusion**
+
+ This work demonstrates that a compact coding LLM can be effectively enhanced using LoRA to achieve efficient training, structured outputs, and improved usability. The proposed approach bridges the gap between research models and practical deployment systems.
+
+ ---
+
+ ## **References**
+
+ * Hugging Face Transformers
+ * PEFT: Parameter-Efficient Fine-Tuning
+ * Qwen2.5-Coder Model
+
+ ---
benchmark.png ADDED

Git LFS Details

  • SHA256: 5187ea0348dcedf464f56870bd2883d7a2e1915f9129d96d0c13e71680675f8a
  • Pointer size: 131 Bytes
  • Size of remote file: 329 kB