Update README.md with model card content
README.md
@@ -2,32 +2,74 @@
library_name: keras-hub
pipeline_tag: text-generation
---
## Model Overview

### Model Summary

Qwen is the series of large language models and large multimodal models developed by the Qwen Team at Alibaba Group. Both the language models and the multimodal models are pretrained on large-scale multilingual and multimodal data and post-trained on high-quality data to align with human preferences. Qwen models support natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, and acting as an AI agent.

The Qwen3-Coder model offers strong performance and efficiency, with the following key enhancements:

* **Strong performance** among open models on **agentic coding**, **agentic browser use**, and other foundational coding tasks.

* **Long-context capabilities**, with native support for **256K** tokens, extendable up to **1M** tokens using YaRN, optimized for repository-scale understanding.

* **Agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models on complex agent-based tasks.

For more details, refer to the Qwen [Blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/keras-team/keras-hub/tree/master/keras_hub/src/models/qwen_moe), and [Documentation](https://qwen.readthedocs.io/en/latest/).

Weights are released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE). Keras model code is released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).

## Links

* Qwen 3 Coder Quickstart Notebook (coming soon)
* [Qwen 3 Coder API Documentation](https://keras.io/keras_hub/api/models/qwen3_moe/)
* [Qwen 3 Coder Model Card](https://qwenlm.github.io/blog/qwen3/)
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)

## Installation

Keras and KerasHub can be installed with:

```
pip install -U -q keras-hub
pip install -U -q keras
```

JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.

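Because Keras can run on any of the three backends, it helps to choose one explicitly before Keras is imported. The sketch below sets the `KERAS_BACKEND` environment variable; `select_backend` and `SUPPORTED_BACKENDS` are hypothetical helper names introduced here for illustration and are not part of Keras.

```python
import os

# Backend names accepted by Keras 3.
SUPPORTED_BACKENDS = ("jax", "torch", "tensorflow")


def select_backend(name: str) -> str:
    """Set KERAS_BACKEND. Call this before the first `import keras`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")
# Import keras / keras_hub only after this point so the choice takes effect.
```

The variable is read when Keras is first imported, so changing it afterwards in the same process is not expected to switch backends.
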
## Available Qwen 3 Coder Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

| Preset | Parameters | Description |
|--------|------------|-------------|
| `qwen3_coder_instruct_30b_a3b_en` | 30.5B | Code-specific Mixture-of-Experts (MoE) model with 30.5B total parameters, of which 3.3B are activated per token. It has 48 layers, 32 query heads and 4 key/value attention heads, and 128 experts (8 active). |

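The figures in the preset table can be turned into a few rough ratios. The sketch below is plain arithmetic on the quoted numbers; the bfloat16 weight estimate assumes 2 bytes per parameter and ignores activations, the key/value cache, and framework overhead, so treat it as a lower bound rather than a measured requirement.

```python
# Figures quoted in the preset table.
TOTAL_PARAMS = 30.5e9
ACTIVE_PARAMS = 3.3e9
NUM_EXPERTS = 128
ACTIVE_EXPERTS = 8
QUERY_HEADS = 32
KV_HEADS = 4
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each value in 2 bytes

# Share of parameters used for a single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
# Share of experts routed to for a single token.
expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS
# Query heads that share each key/value head.
queries_per_kv_head = QUERY_HEADS // KV_HEADS
# Approximate size of the weights alone in bfloat16, in decimal gigabytes.
weight_gb_bf16 = TOTAL_PARAMS * BYTES_PER_PARAM_BF16 / 1e9

print(f"active parameters per token: {active_fraction:.1%}")
print(f"experts used per token:      {expert_fraction:.2%}")
print(f"query heads per KV head:     {queries_per_kv_head}")
print(f"bfloat16 weights (approx.):  {weight_gb_bf16:.0f} GB")
```

Only about a tenth of the parameters take part in each forward step, but all experts still have to be resident in memory, which is why the weight estimate uses the total count.
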
## Example Usage

```python
import keras_hub

# Load the causal language model from the built-in preset.
# The task-level CausalLM class resolves the architecture from the preset.
qwen_lm = keras_hub.models.CausalLM.from_preset("qwen3_coder_instruct_30b_a3b_en")

# Use generate() for code generation.
qwen_lm.generate("Write a quick sort algorithm in Python.", max_length=512)
```

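When post-processing the result of `generate()`, it is often convenient to keep only the newly generated text. The sketch below assumes the returned string begins with the prompt followed by the continuation, which this model card does not state; `remove_prompt_prefix` is a hypothetical helper, and the output string is a stand-in rather than real model output.

```python
def remove_prompt_prefix(prompt: str, generated: str) -> str:
    """Return only the continuation when `generated` starts with `prompt`."""
    if generated.startswith(prompt):
        # Drop the echoed prompt and any whitespace that follows it.
        return generated[len(prompt):].lstrip()
    # The prompt was not echoed, so leave the text untouched.
    return generated


prompt = "Write a quick sort algorithm in Python."
# Stand-in for a model response; not actual output from the model.
fake_output = prompt + "\n\ndef quick_sort(items):\n    ..."

completion = remove_prompt_prefix(prompt, fake_output)
print(completion)
```

If the output does not start with the prompt, the helper returns it unchanged, so it is safe to apply either way.
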
## Example Usage with Hugging Face URI

```python
import keras_hub

# Load the same model through its Hugging Face handle (hf://<org>/<repo>).
qwen_lm = keras_hub.models.CausalLM.from_preset("hf://keras/qwen3_coder_instruct_30b_a3b_en")

# Use generate() for code generation.
qwen_lm.generate("Write a quick sort algorithm in Python.", max_length=512)
```
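The two usage examples differ only in the string passed to `from_preset()`: a bare preset name or an `hf://` handle. A minimal sketch of that relationship follows; `preset_handle` is a hypothetical helper introduced for illustration and is not a KerasHub function.

```python
from typing import Optional

PRESET_NAME = "qwen3_coder_instruct_30b_a3b_en"


def preset_handle(name: str, hf_org: Optional[str] = None) -> str:
    """Return the bare preset name, or an hf:// handle when an org is given."""
    if hf_org is None:
        return name
    return f"hf://{hf_org}/{name}"


builtin_handle = preset_handle(PRESET_NAME)
hub_handle = preset_handle(PRESET_NAME, hf_org="keras")
print(builtin_handle)
print(hub_handle)
```

Either string can then be passed to `from_preset()` as in the examples above.
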