---
license: llama2
library_name: peft
tags:
- solana
- rust
- anchor
- smart-contracts
- finance
- crypto
- unsloth
- codellama
base_model: codellama/CodeLlama-7B-Instruct-hf
datasets:
- synthetic-solana-anchor-10k
language:
- en
---

# Solana-CodeLlama-7B-v1 (Anchor Specialized)

## Overview
**Solana-CodeLlama-7B-v1** is a domain-specialized language model fine-tuned for writing production-ready **Solana Smart Contracts** using the **Anchor Framework**.

While general coding models (like GPT-4 or standard CodeLlama) often hallucinate outdated syntax or struggle with Rust's strict ownership rules, this model was trained on a **high-purity synthetic dataset** of 10,000 algorithmically generated examples, focusing specifically on:
* **Anchor Macros:** Correct usage of `#[derive(Accounts)]`, `#[program]`, `#[account]`.
* **Security Constraints:** Proper PDA seed validation and constraint checks (e.g., `#[account(mut, seeds = [...], bump)]`).
* **Rust & SPL Tokens:** Accurate CPI calls to the SPL Token program.

## Performance & Benchmarks
The model was evaluated against the base `CodeLlama-7B-Instruct` model on a held-out Solana evaluation set.

| Metric | Base Model (Zero-Shot) | **Solana-CodeLlama-7B-v1** |
| :--- | :---: | :---: |
| **Accuracy (Validation)** | ~35% (hallucinates Python/Solidity) | **97.26%** |
| **Accounts Struct** | ❌ FAIL | ✅ PASS |
| **Context Validation** | ❌ FAIL | ✅ PASS |
| **PDA Initialization** | ❌ FAIL | ✅ PASS |
| **SPL Token Transfer** | ❌ FAIL | ✅ PASS |

> *"The model didn't just learn; it absorbed the syntax structure instantly, dropping loss to 0.02 in under 2 epochs."*

## Dataset
* **Source:** 100% synthetic (algorithmic generation).
* **Size:** 10,000 verified examples.
* **Methodology:** We used a "textbook quality" approach, generating examples with compile-ready logic rather than scraping noisy GitHub repositories.
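
The generation pipeline itself is not published, but the template-filling style of algorithmic generation described above can be sketched roughly as follows (all template strings, names, and fields below are invented for illustration only):

```python
import random

# Hypothetical templates; the real pipeline and its prompts are not published.
INSTRUCTION_TEMPLATES = [
    "Write an Anchor instruction named `{name}` that initializes a {kind} PDA.",
    "Write an Anchor instruction named `{name}` that transfers SPL tokens from a {kind} vault.",
]

def generate_example(seed: int) -> dict:
    """Fill a template deterministically so every example is reproducible."""
    rng = random.Random(seed)
    template = rng.choice(INSTRUCTION_TEMPLATES)
    name = rng.choice(["initialize_vault", "create_config", "transfer_rewards"])
    kind = rng.choice(["user", "escrow", "staking"])
    return {"instruction": template.format(name=name, kind=kind)}

# Seeding by index makes the whole dataset regenerable from scratch.
dataset = [generate_example(i) for i in range(10)]
```

Pairing each generated instruction with a compile-checked reference program is what would make such examples "verified" in the sense used above.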

## Usage

### 1. Using Unsloth (Fastest)
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "your-username/Solana-CodeLlama-7B-v1",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

prompt = """Write a Solana Anchor program to initialize a user vault."""
# ... Apply chat template ...
```
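
If the repository does not ship a tokenizer chat template, the chat-template step can be done by hand: the base CodeLlama-Instruct model expects the Llama-2 `[INST]` format. A plain-string sketch (prefer `tokenizer.apply_chat_template` when a template is available):

```python
def build_codellama_prompt(user_msg, system_msg=None):
    """Format a single-turn prompt in the Llama-2 / CodeLlama-Instruct
    style. The tokenizer adds the leading BOS token itself, so it is
    not included here."""
    if system_msg:
        user_msg = f"<<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg}"
    return f"[INST] {user_msg} [/INST]"

prompt = build_codellama_prompt(
    "Write a Solana Anchor program to initialize a user vault."
)
```

The resulting string can be tokenized and passed to `model.generate` as usual.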

### 2. Using GGUF (Ollama / LM Studio)
This model is available in GGUF format for local deployment on consumer hardware (MacBook M1/M2/M3, NVIDIA RTX 3060/4090/5090).
* `Solana-CodeLlama-7B-v1.Q4_K_M.gguf` (Recommended for 8GB+ RAM)
* `Solana-CodeLlama-7B-v1.Q8_0.gguf` (High Precision)
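
For Ollama, a one-line Modelfile pointing at the downloaded GGUF is enough. A minimal sketch (the local alias `solana-codellama` is arbitrary, and the path assumes the file sits in the current directory):

```shell
# Write a one-line Modelfile referencing the quantized weights
echo 'FROM ./Solana-CodeLlama-7B-v1.Q4_K_M.gguf' > Modelfile

# Register the model locally, then query it
ollama create solana-codellama -f Modelfile
ollama run solana-codellama "Write an Anchor instruction that initializes a user vault PDA."
```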

## Training Details
* **Hardware:** NVIDIA RTX 5090 (32GB VRAM).
* **Framework:** Unsloth (Open Source).
* **Precision:** Mixed Precision (BF16).
* **LoRA Rank:** 16.
* **Batch Size:** 8 (Effective).
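
As a sanity check on adapter size, LoRA at rank r adds r * (d_in + d_out) trainable parameters per adapted weight matrix. A back-of-the-envelope sketch, assuming the adapters target only the four attention projections of CodeLlama-7B (hidden size 4096, 32 layers); the card does not state which modules were actually targeted:

```python
# LoRA adds two low-rank factors per adapted matrix:
# A (r x d_in) and B (d_out x r), so r * (d_in + d_out) parameters.
r = 16
hidden = 4096
layers = 32
attn_projections = 4  # q_proj, k_proj, v_proj, o_proj (all 4096 x 4096)

params_per_matrix = r * (hidden + hidden)
trainable = params_per_matrix * attn_projections * layers
print(f"{trainable / 1e6:.1f}M trainable parameters")  # ~16.8M
```

Targeting the MLP projections as well (a common Unsloth default) would roughly triple this count, but either way the adapter remains a small fraction of the 7B base weights.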

## License
Based on CodeLlama (Llama 2 Community License).

---
*Fine-tuned with ❤️ using [Unsloth](https://github.com/unslothai/unsloth).*