pranav-pvnn committed on
Commit
c0f7284
·
verified ·
1 Parent(s): 787673f

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +85 -0
README.md ADDED
@@ -0,0 +1,85 @@
+ ---
+ license: llama2
+ base_model: codellama/CodeLlama-7b-hf
+ tags:
+ - code
+ - llama
+ - gguf
+ - merged
+ - python
+ ---
+
+ # CodeLlama 7B Python AI Assistant (Merged GGUF)
+
+ This is a merged version of the QLoRA fine-tuned CodeLlama-7B model. The LoRA weights have been merged into the base model and the result converted to GGUF format, so it can be deployed without a separate adapter.
+
+ ## Model Details
+
+ - **Base Model**: CodeLlama-7b-hf
+ - **Original LoRA Adapter**: pranav-pvnn/codellama-7b-python-ai-assistant
+ - **Fine-tuning Method**: QLoRA (4-bit quantization with LoRA)
+ - **Format**: GGUF (self-contained, no separate adapter needed)
+ - **Training Framework**: Unsloth
+
+ ## Available Quantizations
+
+ - `codellama-7b-merged-f16.gguf` - Unquantized half precision (FP16) - ~13 GB
+ - `codellama-7b-merged-Q4_K_M.gguf` - 4-bit quantization (recommended) - ~4 GB
+ - `codellama-7b-merged-Q5_K_M.gguf` - 5-bit quantization (higher quality) - ~5 GB
+ - `codellama-7b-merged-Q8_0.gguf` - 8-bit quantization (highest quality) - ~7 GB
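The file sizes listed above follow roughly from parameter count times bits per weight. The sketch below reproduces that arithmetic. The parameter count (about 6.74 billion) and the bits-per-weight figures for the K-quant types are approximations assumed for illustration; they are not taken from this repository, and real files also carry metadata and mixed-precision tensors.

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8.
# N_PARAMS and the K-quant bits-per-weight values are assumed approximations.
N_PARAMS = 6.74e9  # approximate parameter count of a Llama-2-style 7B model

BITS_PER_WEIGHT = {
    "f16": 16.0,     # two bytes per weight
    "Q8_0": 8.5,     # 8-bit values plus a per-block scale
    "Q5_K_M": 5.7,   # approximate average, assumed
    "Q4_K_M": 4.85,  # approximate average, assumed
}


def estimated_size_gb(quant: str, n_params: float = N_PARAMS) -> float:
    """Return an estimated file size in GB (10^9 bytes) for a quantization type."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    for name in BITS_PER_WEIGHT:
        print(f"{name}: ~{estimated_size_gb(name):.1f} GB")
```

As a rule of thumb, the chosen file should fit in available RAM or VRAM with some headroom for the context cache.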
+
+ ## Usage
+
+ ### With llama.cpp:
+ ```bash
+ # -e tells llama-cli to interpret the \n escapes in the prompt as newlines
+ ./llama-cli -m codellama-7b-merged-Q4_K_M.gguf -e -p "### Instruction:\nWrite a Python function to calculate factorial.\n### Response:\n"
+ ```
+
+ ### With Python (llama-cpp-python):
+ ```python
+ from llama_cpp import Llama
+
+ # Load the quantized GGUF file from the current directory
+ llm = Llama(model_path="codellama-7b-merged-Q4_K_M.gguf")
+ prompt = "### Instruction:\nWrite a Python function to calculate factorial.\n### Response:\n"
+ output = llm(prompt, max_tokens=256)
+ # The completion text is in the first entry of "choices"
+ print(output["choices"][0]["text"])
+ ```
+
+ ### With Ollama:
+ 1. Create a Modelfile:
+ ```
+ FROM ./codellama-7b-merged-Q4_K_M.gguf
+ ```
+
+ 2. Create and run the model:
+ ```bash
+ ollama create my-codellama -f Modelfile
+ ollama run my-codellama "Write a Python function to sort a list"
+ ```
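Once the model has been created, it can also be queried from Python through Ollama's local REST API. The following is a minimal sketch using only the standard library. It assumes the Ollama server is running on its default port (11434) and that the model is named `my-codellama` as in the step above. The helper names `build_payload` and `generate` are hypothetical, and setting `raw` to true is a choice made here so that the instruction/response prompt is sent without extra templating.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumed; adjust if the server runs elsewhere)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(instruction: str, model: str = "my-codellama") -> dict:
    """Build the JSON body for a single non-streaming generation request."""
    prompt = f"### Instruction:\n{instruction}\n### Response:\n"
    return {
        "model": model,
        "prompt": prompt,
        "raw": True,      # send the prompt as-is, without template formatting
        "stream": False,  # return one JSON object instead of a stream
    }


def generate(instruction: str, model: str = "my-codellama") -> str:
    """Send the request to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(instruction, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `generate("Write a Python function to sort a list")` should return the completion as a string.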
+
+ ## Training Details
+
+ - **Quantization**: 4-bit QLoRA
+ - **LoRA Rank**: 64
+ - **Learning Rate**: 2e-4
+ - **Epochs**: 4
+ - **Max Seq Length**: 2048
+ - **Training Data**: Custom Python programming examples (~2,000 examples)
+ - **GPU**: NVIDIA Tesla T4
+
+ ## Prompt Format
+
+ ```
+ ### Instruction:
+ [Your instruction here]
+ ### Response:
+ ```
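The template above can be wrapped in a small helper so every request uses the same format. This is a sketch rather than part of the model's tooling: `build_prompt` and `complete` are hypothetical names, and using `### Instruction:` as a stop sequence is an assumption meant to keep the model from starting a new turn, not something this model card specifies. `complete` accepts any callable that follows the llama-cpp-python calling convention used in the Usage section.

```python
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n### Response:\n"
# Assumed stop marker: end generation if the model begins another instruction
STOP_SEQUENCES = ["### Instruction:"]


def build_prompt(instruction: str) -> str:
    """Fill the instruction/response template with a single instruction."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def complete(llm, instruction: str, max_tokens: int = 256) -> str:
    """Run one completion and return the stripped response text.

    `llm` is expected to behave like a llama_cpp.Llama instance: callable with
    a prompt plus `max_tokens` and `stop`, returning a dict with "choices".
    """
    output = llm(build_prompt(instruction), max_tokens=max_tokens, stop=STOP_SEQUENCES)
    return output["choices"][0]["text"].strip()
```

With a loaded model, `complete(llm, "Write a Python function to calculate factorial.")` returns only the text generated after `### Response:`.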
+
+ ## License
+
+ Same as the base model (Llama 2 Community License).
+
+ ## Acknowledgements
+
+ - Base Model: [Meta's CodeLlama](https://huggingface.co/codellama/CodeLlama-7b-hf)
+ - Training Framework: [Unsloth](https://github.com/unslothai/unsloth)