snsnc's picture
Upload README.md with huggingface_hub
d6d6e95 verified
|
Raw
History Blame Contribute Delete
1.44 kB
---
license: apache-2.0
base_model: ByteDance-Seed/Seed-Coder-8B-Instruct
tags:
- gguf
- llama.cpp
- seed-coder
- rust
- strandset
- code
- q4_k_m
- q6_k
---
# Seed-Coder-8B-Instruct Rust Strandset GGUF
GGUF export of a Seed-Coder-8B-Instruct LoRA trained on a 20k-row subset of `Fortytwo-Network/Strandset-Rust-v1`.
Base model:
`ByteDance-Seed/Seed-Coder-8B-Instruct`
LoRA adapter:
`snsnc/Seed-Coder-8B-Instruct-Rust-Strandset-LoRA`
Dataset:
`Fortytwo-Network/Strandset-Rust-v1`
Mapping used:
- `input_data` → user
- `output_data` → assistant
- `task_category` → system
- `crate_name` → system
- `test` → none
Training config:
- Method: LoRA
- Context: 4096
- Epochs: 1
- LR: 2e-4
- Rank: 16
- Alpha: 32
- Dropout: 0.0
- Batch size: 16
- Grad accum: 2
- Effective batch: 32
- Weight decay: 0.001
- Warmup steps: 25
- Packing: off
- Train on completions: on
Files:
- `Seed-Coder-8B-Instruct.Q4_K_M.gguf`
- `Seed-Coder-8B-Instruct.Q6_K.gguf`
Note: this model was trained on Strandset-style structured Rust code tasks. It may emit JSON-style wrappers such as `{"code": "..."}` depending on prompting.
## llama.cpp
```bash
llama-cli \
-m Seed-Coder-8B-Instruct.Q4_K_M.gguf \
--jinja \
--single-turn \
--temp 0.1 \
--top-p 0.8 \
--repeat-penalty 1.05 \
-p "Write only Rust code. No markdown. Implement parse_duration(s: &str) -> Result<std::time::Duration, String> supporting 10s, 5m, 2h. Include tests."