RustMentor-8B

RustMentor-8B is an 8B-parameter Qwen3-based model fine-tuned for Rust programming education and code review. It bridges concepts from Go, Python, and TypeScript to teach Rust through practical examples and Socratic dialogue.

This repository hosts the LoRA adapter weights. For quantized local inference, see rust-mentor-8b-GGUF.

Model Description

  • Base Model: Qwen/Qwen3-8B
  • Model Type: Causal LM (code tutoring + review)
  • Parameters: 8B
  • Context Length: 2048 tokens
  • Fine-tuning: QLoRA (r=32, alpha=32) with Unsloth optimization
  • License: Apache 2.0
  • Language: English, Rust code
  • System Prompt: Rust programming tutor for experienced Go/Python/TypeScript developers learning Rust by building CLI tools.

What It Is Good At

  • Explaining Rust ownership, borrowing, and lifetimes with Go/Python/TS comparisons
  • Code review with borrow checker explanations
  • Error handling patterns (Result, Option, ?, thiserror, anyhow)
  • Async/await and Tokio patterns
  • Smart pointers (Box, Rc, Arc, RefCell)
  • Pattern matching and enum-based design
  • Trait-based architecture and generics
  • Type conversions (From, Into, AsRef, Deref)
  • Serde & JSON serialization
  • CLI tooling with clap
  • Cargo project structure, modules, and workspaces
  • Testing patterns and documentation
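
To illustrate the cross-language framing the model is trained for (this snippet is hand-written for the card, not model output), Rust replaces Go's value-vs-pointer choice with compiler-checked ownership and borrowing:

```rust
// In Go, you choose between passing a value (copy) or a pointer (shared
// mutation). In Rust, the compiler tracks who owns the data and who is
// only borrowing it.

// Takes a shared borrow (&) - like a read-only pointer in Go, but the
// compiler guarantees nothing mutates `names` while we hold the borrow.
fn longest_name(names: &[String]) -> Option<&String> {
    names.iter().max_by_key(|n| n.len())
}

// Takes ownership - the Vec is moved in and dropped when the function ends.
fn consume(names: Vec<String>) -> usize {
    names.len()
}

fn main() {
    let names = vec!["Ada".to_string(), "Grace".to_string()];
    // Borrowing: `names` is still usable after this call.
    assert_eq!(longest_name(&names), Some(&"Grace".to_string()));
    // Moving: after this call, `names` can no longer be used.
    assert_eq!(consume(names), 2);
}
```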

Intended Uses

Primary: Rust programming tutoring, debugging, code review, and guided learning for developers transitioning from Go/Python/TypeScript.

Out-of-scope: General-purpose chat, non-Rust programming, safety-sensitive or factual tasks outside Rust development.

Prompt Examples

"In Go, I just pass values or pointers. What's this ownership thing in Rust?"

"Review this Rust code and explain what the borrow checker is doing:

fn get_longest(a: String, b: String) -> String {
    if a.len() > b.len() { a } else { b }
}"

"How do I handle errors in Rust? I'm used to Go's if err != nil pattern."

"How does async work in Rust? In Go I just use goroutines and it's simple."
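
For context on the error-handling prompt above: Go's `if err != nil` check corresponds to Rust's `Result` type and the `?` operator, roughly as in this illustrative sketch (hand-written, not model output):

```rust
use std::num::ParseIntError;

// Go:   n, err := strconv.Atoi(s); if err != nil { return 0, err }
// Rust: the ? operator early-returns the Err variant, so the happy path
// reads straight through instead of being interleaved with error checks.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.trim().parse()?; // early-returns on parse failure
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_and_double(" 21 "), Ok(42));
    assert!(parse_and_double("not a number").is_err());
}
```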

How to Use

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# This repo hosts LoRA adapter weights, so loading it directly requires the
# `peft` package; transformers then resolves and loads the Qwen3-8B base
# model underneath the adapter automatically.
model = AutoModelForCausalLM.from_pretrained(
    "sylvester-francis/rust-mentor-8b",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("sylvester-francis/rust-mentor-8b")

messages = [
    {"role": "system", "content": "You are RustMentor, an expert Rust programming tutor."},
    {"role": "user", "content": "Explain Rust's ownership model to someone who knows Go."},
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Training Data (Summary)

  • Strandset-Rust-v1: 3,000 samples of Rust code generation, review, refactoring, and bug detection tasks
  • Synthetic tutor conversations: 46 unique hand-crafted Rust tutoring dialogues across 28 topics, covering ownership, error handling, traits, async, smart pointers, macros, serde, testing, and more
  • Style: All conversations draw parallels to Go/Python/TypeScript equivalents

Training Configuration (QLoRA)

  • Base Model: Qwen/Qwen3-8B
  • Method: QLoRA via Unsloth
  • LoRA Rank (r): 32
  • LoRA Alpha: 32
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Epochs: 3
  • Batch Size: 2 x 4 (effective 8)
  • Learning Rate: 2e-4 (cosine schedule)
  • Max Sequence Length: 2048
  • Hardware: NVIDIA A100 40GB (Google Colab)

Evaluation

Qualitative checks on Rust tutoring prompts show:

  • Clear explanations with Go/Python/TypeScript comparisons
  • Accurate code examples with proper ownership and borrowing
  • Borrow checker explanations in code reviews
  • Appropriate use of idiomatic Rust patterns

Safety & Limitations

  • May generate incorrect code or hallucinate crate APIs — review before production use.
  • Not a replacement for the Rust compiler or clippy — always compile and test generated code.
  • Optimized for tutoring, not production code generation at scale.
  • Training data focuses on CLI/systems patterns; web framework coverage (Axum, Actix) is limited.

License

Apache 2.0 for the fine-tuned adapter; base model (Qwen/Qwen3-8B) license also applies.
