GenMobiAi — Qwen2.5-Coder-14B Flutter Specialist

GenMobiAi is a fine-tuned version of Qwen2.5-Coder-14B-Instruct specialized for Flutter and Dart development. It is optimized for agentic code generation, mobile development, and multi-framework orchestration.

Overview

Type: Code Generation + Agentic AI
Parameters: 14.77B
Architecture: Qwen2ForCausalLM (48 layers)
Context Length: 128,000 tokens
Quantization: 4-bit MLX (group_size=64)
Training Method: QLoRA fine-tuning via MLX-LM
Training Data: 311 Flutter/Dart samples from flutter.dev + pub.dev
License: Apache 2.0

Key Features

Flutter Code Generation

  • Widgets: StatelessWidget, StatefulWidget, custom widgets, Material 3 design
  • State Management: Provider, Riverpod, GetX, BLoC, MobX patterns
  • Async Dart: Futures, Streams, isolates, error handling
  • Architecture: MVVM, Clean Architecture, Repository pattern

Pub.dev Package Intelligence

  • HTTP clients (Dio, http with interceptors)
  • Local storage (hive, shared_preferences)
  • Animations (flutter_animate, lottie)
  • Testing (widget tests, unit tests with mockito)

Agentic Capabilities

  • ChatML format with tool-call support (LangGraph-compatible)
  • Multi-message context preservation
  • Structured JSON tool responses

Quick Start

Transformers (CPU/GPU)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("your-org/genmobiai-qwen2.5-coder-14b-flutter")
model = AutoModelForCausalLM.from_pretrained(
    "your-org/genmobiai-qwen2.5-coder-14b-flutter",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
    {"role": "user", "content": "Create a Riverpod provider for a shopping cart."}
]

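# Render the ChatML prompt and append the assistant header
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Set do_sample=True explicitly so temperature/top_p are applied
output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.3, top_p=0.9)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))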

MLX-LM (Apple Silicon, recommended)

python -m mlx_lm.generate \
  --model path/to/genmobiai-qwen2.5-coder-14b-flutter \
  --prompt "Write a Flutter Counter widget with SharedPreferences persistence" \
  --max-tokens 1024 \
  --temp 0.3

vLLM (High-Throughput)

from vllm import LLM, SamplingParams

llm = LLM("path/to/genmobiai-qwen2.5-coder-14b-flutter", max_model_len=8192)
outputs = llm.generate(
    # End the prompt with the assistant header so the model starts its reply
    ["<|im_start|>user\nWrite a Flutter auth provider<|im_end|>\n<|im_start|>assistant\n"],
    SamplingParams(temperature=0.3, top_p=0.9, max_tokens=1024)
)
print(outputs[0].outputs[0].text)
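
The vLLM prompt is a hand-written ChatML string. As a minimal sketch, the helper below renders a message list into that layout. The function name `to_chatml` is hypothetical, and the helper is a simplification: it does not add a default system prompt or tool definitions, which the tokenizer's own chat template may do, so `apply_chat_template` remains the safer choice when a tokenizer is available.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
    {"role": "user", "content": "Write a Flutter auth provider"},
])
print(prompt)
```

The resulting string can be passed to `llm.generate` in place of a hand-written prompt.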

Ollama

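# Both options need a GGUF export of the model, so convert to GGUF first.
# Option 1: serve the GGUF with the llama-cpp-python server
python -m llama_cpp.server --model path/genmobiai-q4_k_m.gguf --port 8000

# Option 2: register the GGUF with Ollama through a Modelfile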
ollama create genmobiai -f - <<EOF
FROM ./genmobiai-q4_k_m.gguf
SYSTEM "You are GenMobiAi, an expert Flutter developer."
PARAMETER temperature 0.3
PARAMETER top_p 0.9
EOF

ollama run genmobiai "Build a Flutter provider for authentication"
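
Once the model is registered, it can also be queried programmatically. The sketch below assumes an Ollama server running locally on its default port (11434) and the model name `genmobiai` from the `ollama create` command. The helper names `build_payload` and `ask_ollama` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, temperature=0.3, top_p=0.9):
    """Build a non-streaming request body for Ollama's generate endpoint."""
    return {
        "model": "genmobiai",
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "top_p": top_p},
    }


def ask_ollama(prompt):
    """POST the prompt to the local server and return the generated text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `print(ask_ollama("Build a Flutter provider for authentication"))` should return the same kind of output as the `ollama run` command.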

Recommended Sampling Parameters

| Use Case | Temperature | Top-P | Top-K | Repetition Penalty |
|---|---|---|---|---|
| Code Generation | 0.3 | 0.9 | 40 | 1.05 |
| Complex Logic | 0.5 | 0.95 | 50 | 1.0 |
| Agentic Output | 0.2 | 0.85 | 40 | 1.1 |
| Creative Patterns | 0.7 | 0.95 | 50 | 0.95 |
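
One way to apply these recommendations is to keep them as named presets. The sketch below encodes the values from this section using the keyword names accepted by Transformers' `generate`. The preset keys and the `generation_kwargs` helper are hypothetical.

```python
# Recommended values from this section, keyed by use case
SAMPLING_PRESETS = {
    "code_generation": {"temperature": 0.3, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.05},
    "complex_logic": {"temperature": 0.5, "top_p": 0.95, "top_k": 50, "repetition_penalty": 1.0},
    "agentic_output": {"temperature": 0.2, "top_p": 0.85, "top_k": 40, "repetition_penalty": 1.1},
    "creative_patterns": {"temperature": 0.7, "top_p": 0.95, "top_k": 50, "repetition_penalty": 0.95},
}


def generation_kwargs(use_case, max_new_tokens=1024):
    """Return keyword arguments for model.generate for the given use case."""
    preset = SAMPLING_PRESETS[use_case]
    # do_sample=True makes sure the sampling parameters are applied
    return {"do_sample": True, "max_new_tokens": max_new_tokens, **preset}


print(generation_kwargs("agentic_output"))
```

A call such as `model.generate(**inputs, **generation_kwargs("code_generation"))` would then pick up the preset.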

Model Specifications

Architecture

  • Model Type: Qwen2ForCausalLM
  • Hidden Size: 5,120
  • Intermediate Size: 13,824
  • Num Layers: 48
  • Num Attention Heads: 40
  • Num KV Heads: 8
  • RoPE Theta: 1,000,000
  • Max Position Embeddings: 128,000
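
These figures allow a rough estimate of KV-cache memory, which helps explain why long contexts are costly. The sketch below assumes a 16-bit cache and a head dimension of hidden size divided by attention heads. Real runtimes may use paged or quantized caches, so treat the numbers as approximate.

```python
HIDDEN_SIZE = 5120
NUM_LAYERS = 48
NUM_HEADS = 40
NUM_KV_HEADS = 8
BYTES_PER_VALUE = 2  # assumption: fp16/bf16 cache entries

# Assumption: head dimension is hidden size divided by attention heads
head_dim = HIDDEN_SIZE // NUM_HEADS


def kv_cache_bytes(tokens):
    """Bytes needed to cache keys and values for `tokens` positions."""
    # Factor 2 covers one key tensor and one value tensor per layer
    return 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim * BYTES_PER_VALUE * tokens


for n in (4_096, 8_192, 128_000):
    print(f"{n:>7} tokens: {kv_cache_bytes(n) / 2**30:.2f} GiB")
```

Under these assumptions the cache costs 192 KiB per token, so 8K tokens need about 1.5 GiB and the full 128,000-token window about 23.4 GiB on top of the weights.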

Tokenizer

  • Type: Qwen2Tokenizer
  • Vocab Size: 152,064
  • EOS Token: <|im_end|> (151645)
  • PAD Token: <|endoftext|> (151643)
  • Special Tokens: ChatML (<|im_start|>, <|im_end|>) + tool-call markers

Quantization (MLX)

  • Bits: 4
  • Group Size: 64
  • Size Reduction: ~28GB (BF16) → ~8.3GB (4-bit)
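
The size figures can be roughly reproduced from the parameter count. The sketch below assumes every weight is quantized and that each group of 64 weights stores one 16-bit scale and one 16-bit bias. Real checkpoints keep some tensors unquantized, so the result is an estimate.

```python
PARAMS = 14.77e9  # parameter count stated on this card
BITS = 4
GROUP_SIZE = 64
# Assumption: one 16-bit scale plus one 16-bit bias per group
OVERHEAD_BITS_PER_GROUP = 32


def size_gb(params, bits_per_weight):
    """Storage in decimal gigabytes for the given bits per weight."""
    return params * bits_per_weight / 8 / 1e9


bits_per_weight = BITS + OVERHEAD_BITS_PER_GROUP / GROUP_SIZE
bf16_gb = size_gb(PARAMS, 16)
q4_gb = size_gb(PARAMS, bits_per_weight)

print(f"BF16:  {bf16_gb:.1f} GB")
print(f"4-bit: {q4_gb:.1f} GB at {bits_per_weight} bits/weight")
```

This gives about 29.5 GB (roughly 27.5 GiB) for BF16 and about 8.3 GB for 4-bit, in line with the figures quoted here.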

Training Configuration

Dataset: 311 Flutter/Dart samples (279 train / 32 eval)
Method: QLoRA via MLX-LM on Apple Silicon
LoRA Rank: 8
Trainable Layers: 16 of 48
Batch Size: 1 | Grad Accumulation: 2
Learning Rate: 1e-5
Max Seq Length: 1,024
Iterations: 1,000
Estimated Training Time: 4–8 hours (M3/M4 24GB)
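
These settings imply several passes over a small dataset. The sketch below works out the effective batch size and the approximate number of epochs. Because this card does not say whether an iteration is a single micro-batch or a full optimizer step, both readings are shown as assumptions.

```python
TRAIN_SAMPLES = 279
EVAL_SAMPLES = 32
BATCH_SIZE = 1
GRAD_ACCUM = 2
ITERATIONS = 1000

# Samples contributing to each optimizer update
effective_batch = BATCH_SIZE * GRAD_ACCUM
eval_fraction = EVAL_SAMPLES / (TRAIN_SAMPLES + EVAL_SAMPLES)

# Assumption A: one iteration processes one micro-batch
epochs_per_microbatch = ITERATIONS * BATCH_SIZE / TRAIN_SAMPLES
# Assumption B: one iteration is one optimizer step
epochs_per_step = ITERATIONS * effective_batch / TRAIN_SAMPLES

print(f"Effective batch size: {effective_batch}")
print(f"Eval split: {eval_fraction:.1%}")
print(f"Epochs: {epochs_per_microbatch:.1f} (A) or {epochs_per_step:.1f} (B)")
```

Either way, each training sample is seen multiple times, which is worth keeping in mind alongside the dataset-size limitation.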

Hardware Requirements

| Hardware | Memory | Inference Speed | Use Case |
|---|---|---|---|
| Apple M3/M4 (MLX) | 16GB+ | 100+ tok/s @ 4K | Development |
| RTX 4090 (BF16) | 24GB | 200+ tok/s | Production |
| H100 (batched) | 80GB | 1000+ tok/s | Server |
| CPU (GGUF Q4) | 32GB | 10–15 tok/s | Edge |

Capabilities & Use Cases

Flutter Development

  • ✅ Widget scaffolding (Material 3, Cupertino, adaptive)
  • ✅ State management patterns (Provider, Riverpod, GetX, BLoC)
  • ✅ REST API integration (Dio, http, interceptors)
  • ✅ Local storage (hive, shared_preferences, file I/O)
  • ✅ Testing (widget tests, unit tests, integration tests)
  • ✅ Platform channels & native integration

Code Quality

  • Null safety best practices
  • MVVM + Clean Architecture patterns
  • Error handling & logging
  • Performance optimization tips
  • Documentation & inline comments

Agentic Features

  • Tool-call support via XML-wrapped JSON
  • Multi-message context preservation
  • Chat template integration (ChatML)
  • LangGraph workflow compatibility
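
Tool calls in the XML-wrapped JSON format can be extracted with a small parser. The sketch below assumes each call is a JSON object between `<tool_call>` and `</tool_call>` with `name` and `arguments` keys. That payload shape is an assumption, and the `create_file` tool is a hypothetical example.

```python
import json
import re

# Capture whatever sits between the tool-call markers, across newlines
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the parsed JSON payload of every well-formed tool call."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
    return calls


sample = (
    "I'll scaffold that widget.\n"
    '<tool_call>\n{"name": "create_file", "arguments": {"path": "lib/cart.dart"}}\n</tool_call>'
)
print(extract_tool_calls(sample))
```

Malformed payloads are skipped rather than raised, so an agent loop can decide separately how to handle a reply that contains no valid calls.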

Limitations

  1. Dataset Size: With only 311 training samples, the model may hallucinate APIs for less-documented packages
  2. Quantization Artifacts: 4-bit weight quantization introduces rounding error that can reduce output quality relative to BF16
  3. Vision Tokens: Vocabulary includes unused vision tokens inherited from the shared Qwen2.5 tokenizer
  4. Context in Practice: MLX 4-bit inference works best at 4K–8K tokens on a 24GB machine
  5. No Formal Benchmarks: Performance was validated empirically, not on standard evals
  6. Dart 3+ Features: Records and sealed classes are only partially covered

Special Tokens

<|endoftext|>      (ID: 151643)  → Padding / Fallback EOS
<|im_start|>       (ID: 151644)  → ChatML message start
<|im_end|>         (ID: 151645)  → ChatML message end (Primary EOS)
<tool_call>        (Custom)       → Opens an agentic tool invocation (XML wrapper)
</tool_call>       (Custom)       → Closes the tool invocation

Citation

@misc{genmobiai2025,
  title   = {GenMobiAi: Qwen2.5-Coder-14B Fine-tuned for Flutter/Dart Development},
  author  = {GenMobiAi Contributors},
  year    = {2025},
  url     = {https://huggingface.co/your-org/genmobiai-qwen2.5-coder-14b-flutter},
  license = {Apache 2.0}
}

@misc{qwen2_5_coder,
  title  = {Qwen2.5-Coder: A Capable Code Language Model},
  author = {Alibaba Cloud},
  year   = {2024},
  url    = {https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct}
}

License

This model is licensed under the Apache License 2.0.

  • Base Model: Qwen2.5-Coder-14B-Instruct by Alibaba Cloud (Apache 2.0)
  • Fine-tuning & Specialization: GenMobiAi Contributors (Apache 2.0)
  • Training Data: flutter.dev (BSD 3-Clause), pub.dev packages (per-package), Flutter GitHub (BSD 3-Clause)

See LICENSE for full text.

Contributing

Issues or improvements?

  • Report on GitHub or HF Hub
  • Submit Flutter patterns to expand the training dataset
  • Improve documentation

Last Updated: 2025-05-25
Status: Production-Ready
Framework Support: Transformers, MLX-LM, vLLM, llama.cpp, Ollama
