---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
library_name: mlx
tags:
  - mlx
  - lora
  - code
  - rhino3d
  - rhinoscriptsyntax
  - rhinocommon
  - 3d-modeling
  - cad
  - python
datasets:
  - custom
language:
  - en
pipeline_tag: text-generation
model-index:
  - name: rhino-coder-7b
    results: []
---

# Rhino Coder 7B

A fine-tuned [Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) model specialized for **Rhino3D Python scripting** — generating correct `rhinoscriptsyntax` and `RhinoCommon` code from natural language instructions.

This is the **fused model** (LoRA weights merged into base). For the standalone LoRA adapter, see [rhino-coder-7b-lora](https://huggingface.co/quocvibui/rhino-coder-7b-lora).

## Why Fine-Tune?

The base Qwen2.5-Coder-7B is a strong general code model, but it doesn't know Rhino's APIs. On 10 held-out Rhino scripting tasks:

| Metric | Base Model | Fine-Tuned | Delta |
|--------|-----------|------------|-------|
| Avg code lines | 11.9 | 8.2 | -3.7 (-31%, more concise) |
| Avg code chars | 427 | 258 | -169 (-40%, less bloat) |

- **Base model hallucinates APIs** — invents `Rhino.Commands.Command.AddPoint()`, `rs.filter.surface`, `rg.PipeSurface.Create()` — none of these exist
- **Fine-tuned uses correct APIs** — `rs.CurveAreaCentroid()`, `rs.AddPipe()`, `rs.GetObject("...", 8)` with the right filter constants
- **Fine-tuned matches reference style** — several outputs are near-identical to the reference solutions

### Example — *"How do I find the centroid of a closed curve?"*

```python
import rhinoscriptsyntax as rs

# BASE MODEL — wrong (averages control points, not the area centroid)
def find_centroid(curve_id):
    points = rs.CurvePoints(curve_id)
    centroid = [0, 0, 0]
    for point in points:
        centroid[0] += point[0]
        centroid[1] += point[1]
        centroid[2] += point[2]
    centroid[0] /= len(points)
    centroid[1] /= len(points)
    centroid[2] /= len(points)
    return centroid

# FINE-TUNED — correct, concise
crv = rs.GetObject('Select closed curve', 4)  # 4 = curve filter
if crv and rs.IsCurveClosed(crv):
    centroid = rs.CurveAreaCentroid(crv)  # returns (centroid, error bound)
    if centroid:
        rs.AddPoint(centroid[0])
```
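The difference is not cosmetic: averaging vertices only coincides with the area centroid for special shapes. A Rhino-free sketch (pure Python, hypothetical helper names, shoelace formula) shows the two answers diverge on an L-shaped polygon:

```python
def vertex_average(pts):
    """Naive average of the vertices -- what the base model computes."""
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def area_centroid(pts):
    """True area centroid of a simple closed polygon (shoelace formula)."""
    a = cx = cy = 0.0
    for i in range(len(pts)):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % len(pts)]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    return (cx / (6 * a), cy / (6 * a))

# L-shaped polygon: the two methods disagree
L = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(vertex_average(L))  # -> (1.0, 1.0)
print(area_centroid(L))   # -> roughly (0.8333, 0.8333)
```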

## Usage

### With MLX (Apple Silicon)

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("quocvibui/rhino-coder-7b")

messages = [
    {"role": "system", "content": "You are an expert Rhino3D Python programmer. Write clean, working scripts using rhinoscriptsyntax and RhinoCommon. Include all necessary imports. Only output code, no explanations unless asked."},
    {"role": "user", "content": "Create a 10x10 grid of spheres with radius 0.5"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
output = generate(model, tokenizer, prompt=prompt, max_tokens=1024)
print(output)
```

### As an OpenAI-compatible server

```bash
mlx_lm server --model quocvibui/rhino-coder-7b --port 8080
```

Then query it like any OpenAI-compatible API:

```python
import requests

response = requests.post("http://localhost:8080/v1/chat/completions", json={
    "model": "default",
    "messages": [
        {"role": "system", "content": "You are an expert Rhino3D Python programmer. Write clean, working scripts using rhinoscriptsyntax and RhinoCommon. Include all necessary imports. Only output code, no explanations unless asked."},
        {"role": "user", "content": "Draw a spiral staircase with 20 steps"}
    ],
    "max_tokens": 1024,
    "temperature": 0.1
})
print(response.json()["choices"][0]["message"]["content"])
```

## Training Details

### Method

LoRA (Low-Rank Adaptation) fine-tuning via [MLX-LM](https://github.com/ml-explore/mlx-examples), then fused into the base model.
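For reference, the train-then-fuse workflow corresponds roughly to the mlx-lm commands below. This is a sketch, not the exact invocation used: the data, adapter, and output paths are placeholders, and flag names should be checked against your installed mlx-lm version.

```shell
# Train a LoRA adapter (hyperparameters mirror the table below)
mlx_lm lora \
  --model Qwen/Qwen2.5-Coder-7B-Instruct \
  --train --data ./data \
  --num-layers 16 --batch-size 1 \
  --learning-rate 1e-5 --iters 9108

# Merge the trained adapter into the base weights (the "fused" model)
mlx_lm fuse \
  --model Qwen/Qwen2.5-Coder-7B-Instruct \
  --adapter-path ./adapters \
  --save-path ./rhino-coder-7b
```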

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| Base model | Qwen2.5-Coder-7B-Instruct (4-bit) |
| Method | LoRA |
| LoRA rank | 8 |
| LoRA scale | 20.0 |
| LoRA dropout | 0.0 |
| LoRA layers | 16 / 28 |
| Batch size | 1 |
| Learning rate | 1e-5 |
| Optimizer | Adam |
| Max sequence length | 2,048 |
| Iterations | 9,108 (2 epochs) |
| Validation loss | 0.184 |
| Training time | ~1.2 hours on M2 Max |

### Dataset

5,060 instruction-code pairs for Rhino3D Python scripting (90/10 train/val split):

| Source | Count |
|--------|-------|
| RhinoCommon API docs | 1,355 |
| RhinoScriptSyntax source | 926 |
| Official samples | 93 |
| Synthetic generation | 187 |
| Backlabeled GitHub | 1 |

**API coverage:**

| API | Pairs |
|-----|-------|
| RhinoCommon | 1,409 |
| rhinoscriptsyntax | 1,134 |
| rhino3dm | 18 |
| compute | 1 |

Data was cleaned aggressively — 10,252 entries excluded from 12,814 total raw entries. Filters removed trivial getters, boilerplate, placeholder code, C#-only types, and duplicates.
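The exact cleaning pipeline is not published; the sketch below only illustrates the *kind* of rules described above, with invented names and thresholds:

```python
def clean(entries):
    """Hypothetical filter pass: drop trivial, placeholder, C#-only,
    and duplicate code entries, keeping the rest."""
    seen = set()
    kept = []
    for e in entries:
        code = e["code"].strip()
        # Trivial or placeholder bodies
        if len(code.splitlines()) < 2 or "TODO" in code or code.endswith("pass"):
            continue
        # C#-only snippets (crude heuristic)
        if "using Rhino;" in code or "public class" in code:
            continue
        # Exact duplicates, normalized on whitespace
        key = " ".join(code.split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(e)
    return kept
```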

### Chat format

```json
{
  "messages": [
    {"role": "system", "content": "You are an expert Rhino3D Python programmer..."},
    {"role": "user", "content": "<instruction>"},
    {"role": "assistant", "content": "<python code>"}
  ]
}
```
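Training data in this shape is one JSON object per line (JSONL). A small helper to emit records in that format — the system prompt is the one from the Usage section; the instruction/code values here are illustrative:

```python
import json

SYSTEM = ("You are an expert Rhino3D Python programmer. Write clean, working "
          "scripts using rhinoscriptsyntax and RhinoCommon. Include all "
          "necessary imports. Only output code, no explanations unless asked.")

def make_record(instruction, code):
    """Serialize one chat-format training record (one JSONL line)."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": code},
        ]
    })

line = make_record("Add a point at the origin",
                   "import rhinoscriptsyntax as rs\nrs.AddPoint((0, 0, 0))")
```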

## Intended Use

- Generating Python scripts for Rhino3D (rhinoscriptsyntax / RhinoCommon)
- Computational design and 3D modeling automation
- Interactive code generation in a Rhino 8 REPL workflow

## Limitations

- Trained on Rhino3D Python APIs only — not a general-purpose coding model
- Best results with rhinoscriptsyntax (`rs.*`) and RhinoCommon (`Rhino.Geometry.*`)
- May not cover every API method — training data focused on the most commonly used patterns
- Quantized to 4-bit — some precision tradeoffs vs. full-precision models
- Optimized for MLX on Apple Silicon; for GPU inference, you may need to convert weights

## Links

- [GitHub: rhino3d-SLM](https://github.com/quocvibui/rhino3d-SLM)
- [LoRA Adapter](https://huggingface.co/quocvibui/rhino-coder-7b-lora)
- [Base model: Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)