Update README.md with new model card content
Browse files
README.md
CHANGED
|
@@ -2,25 +2,74 @@
|
|
| 2 |
library_name: keras-hub
|
| 3 |
pipeline_tag: text-generation
|
| 4 |
---
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
*
|
| 21 |
-
*
|
| 22 |
-
*
|
| 23 |
-
*
|
| 24 |
-
*
|
| 25 |
-
|
| 26 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2 |
library_name: keras-hub
|
| 3 |
pipeline_tag: text-generation
|
| 4 |
---
|
| 5 |
+
### Model Overview
|
| 6 |
+
# Model Summary
|
| 7 |
+
|
| 8 |
+
Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Both language models and multimodal models are pretrained on large-scale multilingual and multimodal data and post-trained on quality data for aligning to human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, playing as AI agent, etc.
|
| 9 |
+
|
| 10 |
+
Unlike Qwen2-Math series which only supports using Chain-of-Thught (CoT) to solve English math problems, Qwen2.5-Math series is expanded to support using both CoT and Tool-integrated Reasoning (TIR) to solve math problems in both Chinese and English. The Qwen2.5-Math series models have achieved significant performance improvements compared to the Qwen2-Math series models on the Chinese and English mathematics benchmarks with CoT.
|
| 11 |
+
|
| 12 |
+
While CoT plays a vital role in enhancing the reasoning capabilities of LLMs, it faces challenges in achieving computational accuracy and handling complex mathematical or algorithmic reasoning tasks, such as finding the roots of a quadratic equation or computing the eigenvalues of a matrix. TIR can further improve the model's proficiency in precise computation, symbolic manipulation, and algorithmic manipulation.
|
| 13 |
+
|
| 14 |
+
For more details, please refer to Qwen [Blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/keras-team/keras-hub/tree/master/keras_hub/src/models/qwen), and [Documentation](https://qwen.readthedocs.io/en/latest/).
|
| 15 |
+
|
| 16 |
+
Weights are released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE) . Keras model code is released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).
|
| 17 |
+
|
| 18 |
+
## Links
|
| 19 |
+
|
| 20 |
+
* [Qwen 2.5 Math Quickstart Notebook](https://www.kaggle.com/code/laxmareddypatlolla/qwen2-5-math-quick-start-notebook)
|
| 21 |
+
* [Qwen 2.5 Math API Documentation](https://keras.io/keras_hub/api/models/qwen/)
|
| 22 |
+
* [Qwen 2.5 Math Model Card](https://qwenlm.github.io/blog/qwen2.5/)
|
| 23 |
+
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
|
| 24 |
+
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
|
| 25 |
+
|
| 26 |
+
## Installation
|
| 27 |
+
|
| 28 |
+
Keras and KerasHub can be installed with:
|
| 29 |
+
|
| 30 |
+
```
|
| 31 |
+
pip install -U -q keras-hub
|
| 32 |
+
pip install -U -q keras
|
| 33 |
+
```
|
| 34 |
+
|
| 35 |
+
Jax, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment see the [Keras Getting Started](https://keras.io/getting_started/) page.
|
| 36 |
+
|
| 37 |
+
## Presets
|
| 38 |
+
|
| 39 |
+
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
|
| 40 |
+
|
| 41 |
+
| Preset name | Parameters | Description |
|
| 42 |
+
|---------------------------------------|------------|--------------------------------------------------------------------------------------------------------------|
|
| 43 |
+
| `qwen2.5_math_1.5b_en` | 1.5B | 28-layer Qwen model with 1.5 billion parameters. |
|
| 44 |
+
| `qwen2.5_math_instruct_1.5b_en` | 1.5B | 28-layer Qwen model with 1.5 billion parameters. Instruction tuned. |
|
| 45 |
+
| `qwen2.5_math_7b_en` | 7B | 28-layer Qwen model with 7 billion parameters. |
|
| 46 |
+
| `qwen2.5_math_instruct_7b_en` | 7B | 28-layer Qwen model with 7 billion parameters. Instruction tuned. |
|
| 47 |
+
|
| 48 |
+
## Example Usage
|
| 49 |
+
```Python
|
| 50 |
+
|
| 51 |
+
import keras
|
| 52 |
+
import keras_hub
|
| 53 |
+
import numpy as np
|
| 54 |
+
|
| 55 |
+
# Use generate() to do code generation.
|
| 56 |
+
qwen_lm = keras_hub.models.QwenCausalLM.from_preset("qwen2.5_math_7b_en")
|
| 57 |
+
qwen_lm.generate(" Find the value of x that satisfies the equation 4x+5 = 6x+7.", max_length=300)
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
```
|
| 61 |
+
|
| 62 |
+
## Example Usage with Hugging Face URI
|
| 63 |
+
|
| 64 |
+
```Python
|
| 65 |
+
|
| 66 |
+
import keras
|
| 67 |
+
import keras_hub
|
| 68 |
+
import numpy as np
|
| 69 |
+
|
| 70 |
+
# Use generate() to do code generation.
|
| 71 |
+
qwen_lm = keras_hub.models.QwenCausalLM.from_preset("hf://keras/qwen2.5_math_7b_en")
|
| 72 |
+
qwen_lm.generate(" Find the value of x that satisfies the equation 4x+5 = 6x+7.", max_length=300)
|
| 73 |
+
|
| 74 |
+
|
| 75 |
+
```
|