Model Overview
Model Summary
Qwen is the large language model and large multimodal model series from the Qwen Team at Alibaba Group. Both the language models and the multimodal models are pretrained on large-scale multilingual and multimodal data and post-trained on high-quality data to align with human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, acting as an AI agent, and more.
The Qwen3-Coder model delivers impressive performance and efficiency, featuring the following key enhancements:
- Significant performance among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks.
- Long-context capabilities with native support for 256K tokens, extendable up to 1M tokens using YaRN, optimized for repository-scale understanding.
- Expertise in agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes, and achieving leading performance among open-source models in complex agent-based tasks.
For more details, please refer to Qwen Blog, GitHub, and Documentation.
Weights are released under the Apache 2.0 License. Keras model code is released under the Apache 2.0 License.
Links
- [Qwen 3 Coder Quickstart Notebook](Coming Soon!!)
- Qwen 3 Coder API Documentation
- Qwen 3 Coder Model Card
- KerasHub Beginner Guide
- KerasHub Model Publishing Guide
Installation
Keras and KerasHub can be installed with:
```shell
pip install -U -q keras-hub
pip install -U -q keras
```
JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
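Keras 3 can run on any of these backends. The backend is selected with the `KERAS_BACKEND` environment variable, which must be set before Keras is first imported (a minimal sketch; the default backend is typically TensorFlow):

```python
# Select the Keras backend before importing keras.
# Valid values are "jax", "tensorflow", and "torch".
import os
os.environ["KERAS_BACKEND"] = "jax"

import keras

# Confirm which backend is active.
print(keras.backend.backend())
```

Note that the environment variable has no effect once `keras` has already been imported in the process.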
Available Qwen 3 Coder Presets
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
| Preset | Parameters | Description |
|---|---|---|
| `qwen3_coder_instruct_30b_a3b_en` | 30B | Code-specific Mixture-of-Experts (MoE) model with 30.5B total parameters (3.3B activated), built on 48 layers, using 32 query and 4 key/value attention heads and 128 experts (8 active). |
Example Usage
```python
import keras_hub

# Use generate() for code generation.
qwen_lm = keras_hub.models.QwenMoeCausalLM.from_preset("qwen3_coder_instruct_30b_a3b_en")
qwen_lm.generate("Write a quick sort algorithm in Python.", max_length=512)
```
Example Usage with Hugging Face URI
```python
import keras_hub

# Use generate() for code generation.
qwen_lm = keras_hub.models.QwenMoeCausalLM.from_preset("hf://keras/qwen3_coder_instruct_30b_a3b_en")
qwen_lm.generate("Write a quick sort algorithm in Python.", max_length=512)
```