---
license: apache-2.0
language:
- en
tags:
- flutter
- dart
- code-generation
- mobile-development
- qwen
- qwen2.5-coder
- mlx
- transformers
- vllm
- text-generation
- agentic
- agent
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-14B-Instruct
datasets:
- flutter_docs_alpaca
---
# GenMobiAi: Qwen2.5-Coder-14B Flutter Specialist

**GenMobiAi** is a fine-tuned version of [Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct) specialized for Flutter and Dart development. It is optimized for agentic code generation, mobile development, and multi-framework orchestration.
## Overview
- **Type**: Code Generation + Agentic AI
- **Parameters**: 14.77B
- **Architecture**: Qwen2ForCausalLM (48 layers)
- **Context Length**: 128,000 tokens
- **Quantization**: 4-bit MLX (group_size=64)
- **Training Method**: QLoRA fine-tuning via MLX-LM
- **Training Data**: 311 Flutter/Dart samples from flutter.dev + pub.dev
- **License**: Apache 2.0
## Key Features
### Flutter Code Generation
- **Widgets**: StatelessWidget, StatefulWidget, custom widgets, Material 3 design
- **State Management**: Provider, Riverpod, GetX, BLoC, MobX patterns
- **Async Dart**: Futures, Streams, isolates, error handling
- **Architecture**: MVVM, Clean Architecture, Repository pattern
### Pub.dev Package Intelligence
- HTTP clients (Dio, http with interceptors)
- Local storage (hive, shared_preferences)
- Animations (flutter_animate, lottie)
- Testing (widget tests, unit tests with mockito)
### Agentic Capabilities
- ChatML format with tool-call support (LangGraph-compatible)
- Multi-message context preservation
- Structured JSON tool responses
## Quick Start
### Transformers (CPU/GPU)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Placeholder repository ID; replace it with the actual model repository.
model_id = "your-org/genmobiai-qwen2.5-coder-14b-flutter"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
    {"role": "user", "content": "Create a Riverpod provider for a shopping cart."},
]

# Render the ChatML prompt and append the assistant generation marker.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True makes temperature and top_p take effect.
output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.3, top_p=0.9)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
### MLX-LM (Apple Silicon, recommended)
```bash
python -m mlx_lm.generate \
  --model path/to/genmobiai-qwen2.5-coder-14b-flutter \
  --prompt "Write a Flutter Counter widget with SharedPreferences persistence" \
  --max-tokens 1024 \
  --temp 0.3
```
### vLLM (High-Throughput)
```python
from vllm import LLM, SamplingParams

llm = LLM("path/to/genmobiai-qwen2.5-coder-14b-flutter", max_model_len=8192)

# Raw ChatML prompt; the trailing assistant marker tells the model to start its reply.
prompt = "<|im_start|>user\nWrite a Flutter auth provider<|im_end|>\n<|im_start|>assistant\n"
outputs = llm.generate(
    [prompt],
    SamplingParams(temperature=0.3, top_p=0.9, max_tokens=1024),
)
print(outputs[0].outputs[0].text)
```
### Ollama
```bash
# Both options assume the model has already been converted to GGUF
# (the conversion step is not shown here).

# Option 1: serve the GGUF file with the llama-cpp-python server
python -m llama_cpp.server --model path/genmobiai-q4_k_m.gguf --port 8000

# Option 2: register the GGUF file with Ollama via a Modelfile
cat > Modelfile <<EOF
FROM ./genmobiai-q4_k_m.gguf
SYSTEM "You are GenMobiAi, an expert Flutter developer."
PARAMETER temperature 0.3
PARAMETER top_p 0.9
EOF
ollama create genmobiai -f Modelfile
ollama run genmobiai "Build a Flutter provider for authentication"
```
## Recommended Sampling Parameters
| Use Case | Temperature | Top-P | Top-K | Repetition Penalty |
|----------|-------------|-------|-------|--------------------|
| Code Generation | 0.3 | 0.9 | 40 | 1.05 |
| Complex Logic | 0.5 | 0.95 | 50 | 1.0 |
| Agentic Output | 0.2 | 0.85 | 40 | 1.1 |
| Creative Patterns | 0.7 | 0.95 | 50 | 0.95 |
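
The presets in the table can be kept in one place and turned into generation arguments. The sketch below is only an illustration: `SAMPLING_PRESETS` and `generation_kwargs` are hypothetical names, and the dictionary keys follow the keyword arguments accepted by `model.generate()` in Transformers.

```python
# Hypothetical helper: the values are copied from the sampling table above.
SAMPLING_PRESETS = {
    "code_generation": {"temperature": 0.3, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.05},
    "complex_logic": {"temperature": 0.5, "top_p": 0.95, "top_k": 50, "repetition_penalty": 1.0},
    "agentic_output": {"temperature": 0.2, "top_p": 0.85, "top_k": 40, "repetition_penalty": 1.1},
    "creative_patterns": {"temperature": 0.7, "top_p": 0.95, "top_k": 50, "repetition_penalty": 0.95},
}


def generation_kwargs(use_case: str, max_new_tokens: int = 1024) -> dict:
    """Return keyword arguments for `model.generate()` for one use case."""
    if use_case not in SAMPLING_PRESETS:
        raise KeyError(f"unknown use case: {use_case!r}")
    kwargs = dict(SAMPLING_PRESETS[use_case])  # copy so the preset stays unchanged
    kwargs["do_sample"] = True  # sampling must be enabled for these settings to apply
    kwargs["max_new_tokens"] = max_new_tokens
    return kwargs


if __name__ == "__main__":
    print(generation_kwargs("agentic_output"))
```

With a loaded model this would be used as `model.generate(**inputs, **generation_kwargs("code_generation"))`.
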
## Model Specifications
### Architecture
- **Model Type**: Qwen2ForCausalLM
- **Hidden Size**: 5,120
- **Intermediate Size**: 13,824
- **Num Layers**: 48
- **Num Attention Heads**: 40
- **Num KV Heads**: 8
- **RoPE Theta**: 1,000,000
- **Max Position Embeddings**: 128,000
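
These figures determine how much memory the key/value cache needs per token, which is what limits usable context length in practice. The sketch below estimates it under stated assumptions: a 16-bit cache, batch size 1, and no cache quantization or paging. The function name `kv_cache_bytes` is hypothetical. With the values listed above, the head dimension is 5,120 / 40 = 128, giving 192 KiB per token and about 1.5 GiB at 8,192 tokens.

```python
def kv_cache_bytes(
    seq_len: int,
    num_layers: int = 48,
    num_kv_heads: int = 8,
    hidden_size: int = 5120,
    num_attention_heads: int = 40,
    bytes_per_value: int = 2,  # assumption: 16-bit cache entries
) -> int:
    """Estimate KV-cache size in bytes for a single sequence."""
    head_dim = hidden_size // num_attention_heads
    # One key vector and one value vector per layer and per KV head.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len


if __name__ == "__main__":
    for tokens in (4096, 8192, 128_000):
        gib = kv_cache_bytes(tokens) / 2**30
        print(f"{tokens:>7} tokens: {gib:.2f} GiB")
```
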
### Tokenizer
- **Type**: Qwen2Tokenizer
- **Vocab Size**: 152,064
- **EOS Token**: `<|im_end|>` (151645)
- **PAD Token**: `<|endoftext|>` (151643)
- **Special Tokens**: ChatML (`<|im_start|>`, `<|im_end|>`) + tool-call markers
### Quantization (MLX)
- **Bits**: 4
- **Group Size**: 64
- **Size Reduction**: ~28GB (BF16) to ~8.3GB (4-bit)
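
The size figures can be sanity-checked with a little arithmetic. The sketch below assumes that every parameter is quantized and that each group of 64 weights stores one 16-bit scale and one 16-bit bias, which adds 0.5 bits per weight for an effective 4.5 bits. Real checkpoints keep some tensors at higher precision, so treat the result as an approximation. The function names are hypothetical.

```python
def quantized_size_gb(
    num_params: float,
    bits: int = 4,
    group_size: int = 64,
    scale_bits: int = 16,  # assumption: one 16-bit scale per group
    bias_bits: int = 16,   # assumption: one 16-bit bias per group
) -> float:
    """Approximate size in GB (10^9 bytes) of group-quantized weights."""
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return num_params * bits_per_weight / 8 / 1e9


def dense_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size in GB (10^9 bytes) of unquantized 16-bit weights."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    params = 14.77e9  # parameter count from the overview
    print(f"BF16:  {dense_size_gb(params):.2f} GB")
    print(f"4-bit: {quantized_size_gb(params):.2f} GB")
```

Under these assumptions the 4-bit estimate comes to roughly 8.3 GB, in line with the figure above. The BF16 estimate is about 29.5 GB in decimal units, or about 27.5 GiB.
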
## Training Configuration
- **Dataset**: 311 Flutter/Dart samples (279 train / 32 eval)
- **Method**: QLoRA via MLX-LM on Apple Silicon
- **LoRA Rank**: 8
- **Trainable Layers**: 16 of 48
- **Batch Size**: 1
- **Grad Accumulation**: 2
- **Learning Rate**: 1e-5
- **Max Seq Length**: 1,024
- **Iterations**: 1,000
- **Estimated Training Time**: 4-8 hours (M3/M4 24GB)
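
These settings imply an effective batch size and a number of passes over the data. The sketch below works them out. Whether one "iteration" corresponds to a single micro-batch or to a full optimizer step after accumulation is not stated here, so both readings are computed. The function name `training_summary` is hypothetical.

```python
def training_summary(
    train_samples: int = 279,
    batch_size: int = 1,
    grad_accumulation: int = 2,
    iterations: int = 1000,
) -> dict:
    """Derive effective batch size and approximate epoch counts."""
    effective_batch = batch_size * grad_accumulation
    # Reading A (assumption): one iteration processes one micro-batch.
    epochs_micro = iterations * batch_size / train_samples
    # Reading B (assumption): one iteration is a full optimizer step.
    epochs_step = iterations * effective_batch / train_samples
    return {
        "effective_batch": effective_batch,
        "epochs_if_iteration_is_micro_batch": epochs_micro,
        "epochs_if_iteration_is_optimizer_step": epochs_step,
    }


if __name__ == "__main__":
    for key, value in training_summary().items():
        print(f"{key}: {value:.2f}")
```

Either way, the model passes over the 279 training samples several times, which is relevant to the dataset-size limitation listed later.
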
## Hardware Requirements
| Hardware | Memory | Inference Speed | Use Case |
|----------|--------|-----------------|----------|
| Apple M3/M4 (MLX) | 16GB+ | 100+ tok/s @ 4K | Development |
| RTX 4090 (BF16) | 24GB | 200+ tok/s | Production |
| H100 (batched) | 80GB | 1000+ tok/s | Server |
| CPU (GGUF Q4) | 32GB | 10-15 tok/s | Edge |
## Capabilities & Use Cases
### Flutter Development
- Widget scaffolding (Material 3, Cupertino, adaptive)
- State management patterns (Provider, Riverpod, GetX, BLoC)
- REST API integration (Dio, http, interceptors)
- Local storage (hive, shared_preferences, file I/O)
- Testing (widget tests, unit tests, integration tests)
- Platform channels & native integration
### Code Quality
- Null safety best practices
- MVVM + Clean Architecture patterns
- Error handling & logging
- Performance optimization tips
- Documentation & inline comments
### Agentic Features
- Tool-call support via XML-wrapped JSON
- Multi-message context preservation
- Chat template integration (ChatML)
- LangGraph workflow compatibility
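
Tool calls wrapped in XML tags can be pulled out of generated text with a small parser. The sketch below is a minimal illustration, assuming each call is a JSON object between `<tool_call>` and `</tool_call>`. The payload shape with `name` and `arguments` keys is an assumption, and both `extract_tool_calls` and the tool name `run_flutter_analyze` are hypothetical.

```python
import json
import re

# Matches everything between an opening and closing tool_call tag.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list:
    """Return the JSON payloads of all well-formed tool calls in `text`."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(raw))
        except json.JSONDecodeError:
            # Skip malformed payloads instead of failing the whole parse.
            continue
    return calls


if __name__ == "__main__":
    sample = (
        "I will analyze the project first.\n"
        "<tool_call>\n"
        '{"name": "run_flutter_analyze", "arguments": {"path": "lib/"}}\n'
        "</tool_call>"
    )
    print(extract_tool_calls(sample))
```

An agent loop would dispatch each parsed call to the matching tool and feed the result back as a new message.
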
## Limitations
1. **Dataset Size**: With only 311 training samples, the model may hallucinate APIs for less-documented packages
2. **Quantization Artifacts**: 4-bit weight quantization introduces rounding error that can reduce output quality relative to BF16
3. **Vision Tokens**: The vocabulary includes image tokens inherited from the base tokenizer; they are inactive in this model
4. **Context in Practice**: MLX 4-bit inference works best at 4K-8K tokens on a 24GB machine
5. **No Formal Benchmarks**: Performance validated empirically, not on standard evals
6. **Dart 3+ Features**: Records and sealed classes are only partially covered
## Special Tokens
```
<|endoftext|>  (ID: 151643)  - Padding / fallback EOS
<|im_start|>   (ID: 151644)  - ChatML message start
<|im_end|>     (ID: 151645)  - ChatML message end (primary EOS)
<tool_call>    (Custom)      - Agentic tool invocation start (XML wrapper)
</tool_call>   (Custom)      - Agentic tool invocation end
```
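
To show how these markers fit together, the sketch below assembles a ChatML prompt by hand. It is a simplified illustration only: `build_chatml_prompt` is a hypothetical helper, and the tokenizer's own `apply_chat_template` remains the authoritative way to render prompts, since the bundled template may add details such as a default system message.

```python
def build_chatml_prompt(messages: list, add_generation_prompt: bool = True) -> str:
    """Render a list of {"role", "content"} messages as a ChatML string."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model starts its reply here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_chatml_prompt(
        [
            {"role": "system", "content": "You are GenMobiAi, an expert Flutter developer."},
            {"role": "user", "content": "Write a Flutter auth provider"},
        ]
    )
    print(prompt)
```

Generation stops when the model emits `<|im_end|>`, which is why it is listed as the primary EOS token.
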
## Citation
```bibtex
@misc{genmobiai2025,
  title = {GenMobiAi: Qwen2.5-Coder-14B Fine-tuned for Flutter/Dart Development},
  author = {GenMobiAi Contributors},
  year = {2025},
  url = {https://huggingface.co/your-org/genmobiai-qwen2.5-coder-14b-flutter},
  license = {Apache 2.0}
}
@misc{qwen2_5_coder,
  title = {Qwen2.5-Coder: A Capable Code Language Model},
  author = {Alibaba Cloud},
  year = {2024},
  url = {https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct}
}
```
## License
This model is licensed under the **Apache License 2.0**.
- **Base Model**: Qwen2.5-Coder-14B-Instruct by Alibaba Cloud (Apache 2.0)
- **Fine-tuning & Specialization**: GenMobiAi Contributors (Apache 2.0)
- **Training Data**: flutter.dev (BSD 3-Clause), pub.dev packages (per-package), Flutter GitHub (BSD 3-Clause)

See [LICENSE](./LICENSE) for full text.
## Contributing
Issues or improvements?
- Report on [GitHub](https://github.com/your-org/genmobiai) or [HF Hub](https://huggingface.co/your-org/genmobiai-qwen2.5-coder-14b-flutter)
- Submit Flutter patterns to expand the training dataset
- Improve documentation

---

- **Last Updated**: 2025-05-25
- **Status**: Production-Ready
- **Framework Support**: Transformers, MLX-LM, vLLM, llama.cpp, Ollama