---
license: apache-2.0
language:
- en
tags:
- flutter
- dart
- code-generation
- mobile-development
- qwen
- qwen2.5-coder
- mlx
- transformers
- vllm
- text-generation
- agentic
- agent
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-14B-Instruct
datasets:
- flutter_docs_alpaca
---

# GenMobiAi — Qwen2.5-Coder-14B Flutter Specialist

**GenMobiAi** is a fine-tuned version of [Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct) specialized for Flutter and Dart development. It is optimized for agentic code generation, mobile development, and multi-framework orchestration.

## Overview

**Type**: Code Generation + Agentic AI  
**Parameters**: 14.77B  
**Architecture**: Qwen2ForCausalLM (48 layers)  
**Context Length**: 128,000 tokens  
**Quantization**: 4-bit MLX (group_size=64)  
**Training Method**: QLoRA fine-tuning via MLX-LM  
**Training Data**: 311 Flutter/Dart samples from flutter.dev + pub.dev  
**License**: Apache 2.0

## Key Features

### Flutter Code Generation
- **Widgets**: StatelessWidget, StatefulWidget, custom widgets, Material 3 design
- **State Management**: Provider, Riverpod, GetX, BLoC, MobX patterns
- **Async Dart**: Futures, Streams, isolates, error handling
- **Architecture**: MVVM, Clean Architecture, Repository pattern

### Pub.dev Package Intelligence
- HTTP clients (Dio, http with interceptors)
- Local storage (hive, shared_preferences)
- Animations (flutter_animate, lottie)
- Testing (widget tests, unit tests with mockito)

### Agentic Capabilities
- ChatML format with tool-call support (LangGraph-compatible)
- Multi-message context preservation
- Structured JSON tool responses

## Quick Start

### Transformers (CPU/GPU)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("your-org/genmobiai-qwen2.5-coder-14b-flutter")
model = AutoModelForCausalLM.from_pretrained(
    "your-org/genmobiai-qwen2.5-coder-14b-flutter",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
    {"role": "user", "content": "Create a Riverpod provider for a shopping cart."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
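# Sample explicitly so temperature/top_p take effect regardless of the saved generation config
output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.3, top_p=0.9)
# Decode only the newly generated tokens, not the echoed prompt
new_tokens = output[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))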
```

### MLX-LM (Apple Silicon, recommended)

```bash
python -m mlx_lm.generate \
  --model path/to/genmobiai-qwen2.5-coder-14b-flutter \
  --prompt "Write a Flutter Counter widget with SharedPreferences persistence" \
  --max-tokens 1024 \
  --temp 0.3
```

### vLLM (High-Throughput)

```python
from vllm import LLM, SamplingParams

llm = LLM("path/to/genmobiai-qwen2.5-coder-14b-flutter", max_model_len=8192)
outputs = llm.generate(
    # End the prompt with the assistant header so the model starts its reply
    ["<|im_start|>user\nWrite a Flutter auth provider<|im_end|>\n<|im_start|>assistant\n"],
    SamplingParams(temperature=0.3, top_p=0.9, max_tokens=1024)
)
print(outputs[0].outputs[0].text)
```

### llama.cpp / Ollama (GGUF)

```bash
# Requires a GGUF export of the model (e.g. Q4_K_M); serve it with llama-cpp-python
python -m llama_cpp.server --model path/genmobiai-q4_k_m.gguf --port 8000

# Or register the same GGUF file with Ollama via a Modelfile
ollama create genmobiai -f - <<EOF
FROM ./genmobiai-q4_k_m.gguf
SYSTEM "You are GenMobiAi, an expert Flutter developer."
PARAMETER temperature 0.3
PARAMETER top_p 0.9
EOF

ollama run genmobiai "Build a Flutter provider for authentication"
```

## Recommended Sampling Parameters

| Use Case | Temperature | Top-P | Top-K | Repetition Penalty |
|----------|------------|-------|-------|-------------------|
| Code Generation | 0.3 | 0.9 | 40 | 1.05 |
| Complex Logic | 0.5 | 0.95 | 50 | 1.0 |
| Agentic Output | 0.2 | 0.85 | 40 | 1.1 |
| Creative Patterns | 0.7 | 0.95 | 50 | 0.95 |
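
The presets in the table can be kept in one place and passed to `model.generate` in Transformers. The following is a minimal sketch: the preset keys and the `generation_kwargs` helper are hypothetical names chosen for illustration, while the values are copied from the table.

```python
# Sampling presets copied from the table; the dictionary keys are
# hypothetical labels, not names defined by the model or any library.
SAMPLING_PRESETS = {
    "code_generation": {"temperature": 0.3, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.05},
    "complex_logic": {"temperature": 0.5, "top_p": 0.95, "top_k": 50, "repetition_penalty": 1.0},
    "agentic_output": {"temperature": 0.2, "top_p": 0.85, "top_k": 40, "repetition_penalty": 1.1},
    "creative_patterns": {"temperature": 0.7, "top_p": 0.95, "top_k": 50, "repetition_penalty": 0.95},
}


def generation_kwargs(use_case: str, max_new_tokens: int = 1024) -> dict:
    """Build keyword arguments for `model.generate(**inputs, **kwargs)`."""
    if use_case not in SAMPLING_PRESETS:
        raise KeyError(f"Unknown use case: {use_case!r}")
    kwargs = dict(SAMPLING_PRESETS[use_case])  # copy so the preset stays unchanged
    kwargs["do_sample"] = True                 # sampling must be on for these to apply
    kwargs["max_new_tokens"] = max_new_tokens
    return kwargs


print(generation_kwargs("agentic_output"))
```

With the Transformers example from Quick Start, this would be used as `model.generate(**inputs, **generation_kwargs("code_generation"))`.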

## Model Specifications

### Architecture
- **Model Type**: Qwen2ForCausalLM
- **Hidden Size**: 5,120
- **Intermediate Size**: 13,824
- **Num Layers**: 48
- **Num Attention Heads**: 40
- **Num KV Heads**: 8
- **RoPE Theta**: 1,000,000
- **Max Position Embeddings**: 128,000
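
These numbers determine how much memory the KV cache needs as the context grows. The sketch below is a rough estimate only: it assumes a 16-bit cache, a batch size of 1, and a head dimension equal to hidden size divided by attention heads; `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
def kv_cache_bytes(
    num_tokens: int,
    num_layers: int = 48,
    num_kv_heads: int = 8,
    hidden_size: int = 5120,
    num_attention_heads: int = 40,
    bytes_per_value: int = 2,  # assumption: 16-bit (BF16/FP16) cache entries
) -> int:
    """Estimate KV-cache size for a single sequence."""
    head_dim = hidden_size // num_attention_heads  # 5120 / 40 = 128
    # Factor 2 covers the separate key and value tensors in every layer.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * num_tokens


for tokens in (4096, 8192, 128000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>7} tokens -> {gib:.2f} GiB")
# 4096 tokens   -> 0.75 GiB
# 8192 tokens   -> 1.50 GiB
# 128000 tokens -> 23.44 GiB
```

Under these assumptions the cache costs 192 KiB per token, so a full-length context would need far more memory than the 4-bit weights themselves.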

### Tokenizer
- **Type**: Qwen2Tokenizer
- **Vocab Size**: 152,064
- **EOS Token**: `<|im_end|>` (151645)
- **PAD Token**: `<|endoftext|>` (151643)
- **Special Tokens**: ChatML (`<|im_start|>`, `<|im_end|>`) + tool-call markers

### Quantization (MLX)
- **Bits**: 4
- **Group Size**: 64
- **Size Reduction**: ~28GB (BF16) → ~8.3GB (4-bit)
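
The size figures can be approximated with back-of-the-envelope arithmetic. This sketch assumes that each quantization group stores a 16-bit scale and a 16-bit bias alongside its 4-bit weights, and that every parameter is quantized; real checkpoints usually keep some tensors in higher precision, so treat the result as an estimate. The helper names are hypothetical.

```python
PARAMS = 14.77e9  # parameter count from the Overview


def quantized_bits_per_weight(bits: int = 4, group_size: int = 64,
                              overhead_bits_per_group: int = 32) -> float:
    """Effective bits per weight, including per-group scale and bias.

    Assumption: one 16-bit scale plus one 16-bit bias per group (32 bits).
    """
    return bits + overhead_bits_per_group / group_size


def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Storage needed for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


bf16 = weight_bytes(PARAMS, 16)
q4 = weight_bytes(PARAMS, quantized_bits_per_weight())

print(f"BF16 : {bf16 / 1e9:.1f} GB ({bf16 / 2**30:.1f} GiB)")
print(f"4-bit: {q4 / 1e9:.1f} GB ({q4 / 2**30:.1f} GiB)")
```

With a group size of 64 the overhead adds 0.5 bits per weight, which gives 4.5 effective bits and an estimate in the same range as the figures above.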

## Training Configuration

**Dataset**: 311 Flutter/Dart samples (279 train / 32 eval)  
**Method**: QLoRA via MLX-LM on Apple Silicon  
**LoRA Rank**: 8  
**Trainable Layers**: 16 of 48  
**Batch Size**: 1 | **Grad Accumulation**: 2  
**Learning Rate**: 1e-5  
**Max Seq Length**: 1,024  
**Iterations**: 1,000  
**Estimated Training Time**: 4–8 hours (M3/M4 24GB)
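
To put the iteration count in context, the sketch below converts the settings above into approximate passes over the 279 training samples. Whether one iteration corresponds to a micro-batch or to a full optimizer step depends on the training tool and version, so both readings are computed; this is arithmetic on the listed values, not a record of the actual run.

```python
TRAIN_SAMPLES = 279
BATCH_SIZE = 1
GRAD_ACCUMULATION = 2
ITERATIONS = 1000

# Samples contributing to each weight update.
effective_batch = BATCH_SIZE * GRAD_ACCUMULATION


def epochs(iterations: int, samples_per_iteration: int, dataset_size: int) -> float:
    """Approximate number of passes over the dataset."""
    return iterations * samples_per_iteration / dataset_size


# Reading 1 (assumption): an iteration is a single micro-batch.
epochs_micro = epochs(ITERATIONS, BATCH_SIZE, TRAIN_SAMPLES)
# Reading 2 (assumption): an iteration is a full optimizer step.
epochs_step = epochs(ITERATIONS, effective_batch, TRAIN_SAMPLES)

print(f"effective batch size: {effective_batch}")
print(f"epochs if iteration = micro-batch:    {epochs_micro:.1f}")
print(f"epochs if iteration = optimizer step: {epochs_step:.1f}")
```

Either way the model sees each sample several times, which is worth keeping in mind alongside the dataset-size limitation.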

## Hardware Requirements

| Hardware | Memory | Inference Speed | Use Case |
|----------|--------|-----------------|----------|
| Apple M3/M4 (MLX) | 16GB+ | 100+ tok/s @ 4K | Development |
| RTX 4090 (BF16) | 24GB | 200+ tok/s | Production |
| H100 (batched) | 80GB | 1000+ tok/s | Server |
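| CPU (GGUF Q4) | 32GB | 10–15 tok/s | Edge |

Note: BF16 weights alone are ~28GB, so the RTX 4090 row requires CPU offloading or a quantized variant to fit in 24GB.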

## Capabilities & Use Cases

### Flutter Development
- ✅ Widget scaffolding (Material 3, Cupertino, adaptive)
- ✅ State management patterns (Provider, Riverpod, GetX, BLoC)
- ✅ REST API integration (Dio, http, interceptors)
- ✅ Local storage (hive, shared_preferences, file I/O)
- ✅ Testing (widget tests, unit tests, integration tests)
- ✅ Platform channels & native integration

### Code Quality
- Null safety best practices
- MVVM + Clean Architecture patterns
- Error handling & logging
- Performance optimization tips
- Documentation & inline comments

### Agentic Features
- Tool-call support via XML-wrapped JSON
- Multi-message context preservation
- Chat template integration (ChatML)
- LangGraph workflow compatibility
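
On the client side, tool calls have to be pulled out of the generated text before they can be dispatched. The following is a minimal sketch, assuming each call is a JSON object wrapped in `<tool_call>` and `</tool_call>` tags; the `extract_tool_calls` helper and the `search_pub_dev` tool name are hypothetical.

```python
import json
import re

# Lazily match everything between the opening and closing tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed JSON object found inside tool-call tags."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed payloads instead of failing the whole turn
        if isinstance(parsed, dict):
            calls.append(parsed)
    return calls


# Hypothetical model output containing one tool call.
sample = (
    "I'll check the package first.\n"
    "<tool_call>\n"
    '{"name": "search_pub_dev", "arguments": {"query": "riverpod"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(sample))
# [{'name': 'search_pub_dev', 'arguments': {'query': 'riverpod'}}]
```

Skipping malformed payloads keeps an agent loop running when the model emits broken JSON; a stricter client might return the parse error to the model instead.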

## Limitations

1. **Dataset Size**: The fine-tuning set has only 311 samples, so the model may hallucinate APIs for less-documented packages
2. **Quantization Artifacts**: 4-bit weight quantization introduces rounding error that can reduce output quality relative to BF16
3. **Vision Tokens**: The vocabulary includes unused vision tokens inherited from the shared Qwen2.5 tokenizer; the model itself is text-only
4. **Context in Practice**: On 24GB machines, MLX 4-bit inference works best with 4K–8K tokens of context
5. **No Formal Benchmarks**: Performance was validated empirically, not on standard evals
6. **Dart 3+ Features**: Records and sealed classes are only partially covered

## Special Tokens

```
<|endoftext|>      (ID: 151643)  → Padding / Fallback EOS
<|im_start|>       (ID: 151644)  → ChatML message start
<|im_end|>         (ID: 151645)  → ChatML message end (Primary EOS)
<tool_call>        (Custom)       → Agentic tool invocation (XML wrapper)
</tool_call>       (Custom)       → Agentic tool invocation end
```
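
When a backend takes a raw prompt string rather than a message list, the ChatML markers above have to be assembled by hand. This is a minimal sketch of that layout; `to_chatml` is a hypothetical helper, and the tokenizer's own `apply_chat_template` remains the safer choice because the bundled template may add details (such as a default system prompt) that this sketch omits.

```python
def to_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # An open assistant header tells the model to begin its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
    {"role": "user", "content": "Write a Flutter auth provider"},
])
print(prompt)
```

Generation should stop at `<|im_end|>`, which is the primary EOS token listed above.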

## Citation

```bibtex
@misc{genmobiai2025,
  title   = {GenMobiAi: Qwen2.5-Coder-14B Fine-tuned for Flutter/Dart Development},
  author  = {GenMobiAi Contributors},
  year    = {2025},
  url     = {https://huggingface.co/your-org/genmobiai-qwen2.5-coder-14b-flutter},
  license = {Apache 2.0}
}

@misc{qwen2_5_coder,
  title  = {Qwen2.5-Coder: A Capable Code Language Model},
  author = {Alibaba Cloud},
  year   = {2024},
  url    = {https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct}
}
```

## License

This model is licensed under the **Apache License 2.0**.

- **Base Model**: Qwen2.5-Coder-14B-Instruct by Alibaba Cloud (Apache 2.0)
- **Fine-tuning & Specialization**: GenMobiAi Contributors (Apache 2.0)
- **Training Data**: flutter.dev (BSD 3-Clause), pub.dev packages (per-package), Flutter GitHub (BSD 3-Clause)

See [LICENSE](./LICENSE) for full text.

## Contributing

Issues or improvements? 
- Report on [GitHub](https://github.com/your-org/genmobiai) or [HF Hub](https://huggingface.co/your-org/genmobiai-qwen2.5-coder-14b-flutter)
- Submit Flutter patterns to expand the training dataset
- Improve documentation

---

**Last Updated**: 2025-05-25  
**Status**: Production-Ready  
**Framework Support**: Transformers, MLX-LM, vLLM, llama.cpp, Ollama