star-staff committed
Commit cc429ad · verified · 1 Parent(s): 30e2f3d

Update README.md

Files changed (1): README.md (+6 −9)
README.md CHANGED
@@ -4,13 +4,12 @@ language:
  - en
  - zh
  pipeline_tag: text-generation
- base_model: Qwen/Qwen3-4b
+ base_model: Qwen/Qwen3-4B
  tags:
  - chat
  - function-calling
  - tool-use
  - star-method
- - sota
  library_name: transformers
  ---
@@ -18,23 +17,21 @@ library_name: transformers
 
  ## Introduction
 
- **STAR-4b** is a highly capable 4b parameter language model specialized in function calling, achieving excellent performances on the [Berkeley Function Calling Leaderboard (BFCL)](https://huggingface.co/spaces/gorilla-llm/berkeley-function-calling-leaderboard) for models in its size class.
+ **STAR-4b** is a highly capable 4B parameter language model specialized in function calling.
 
- This model is the result of fine-tuning the `Qwen/Qwen3-4b` base model using the novel **STAR (Similarity-guided Teacher-Assisted Refinement)** framework. STAR is a holistic training curriculum designed to effectively transfer the advanced capabilities of large language models (LLMs) into "super-tiny" models, making them powerful, accessible, and efficient for real-world agentic applications.
+ This model is the result of fine-tuning the `Qwen/Qwen3-4B` base model using the novel **STAR (Similarity-guided Teacher-Assisted Refinement)** framework. STAR is a holistic training curriculum designed to effectively transfer the advanced capabilities of large language models (LLMs) into "super-tiny" models, making them powerful, accessible, and efficient for real-world agentic applications.
 
  The key innovations of the STAR framework include:
  - **Similarity-guided RL (Sim-RL)**: A reinforcement learning mechanism that uses a fine-grained, similarity-based reward signal. This provides a more robust and continuous signal for policy optimization compared to simple binary rewards, which is crucial for complex, multi-solution tasks like function calling.
  - **Constrained Knowledge Distillation (CKD)**: An advanced training objective that augments top-k forward KL divergence to suppress confidently incorrect predictions. This ensures training stability while preserving the model's exploration capacity, creating a strong foundation for the subsequent RL phase.
 
- Our STAR-4b model significantly outperforms other open models under 1B parameters and even surpasses several larger models, demonstrating the effectiveness of the STAR methodology.
-
  ## Model Details
 
  - **Model Type**: Causal Language Model, fine-tuned for function calling.
- - **Base Model**: `Qwen/Qwen3-4b`
+ - **Base Model**: `Qwen/Qwen3-4B`
  - **Training Framework**: STAR (CKD + Sim-RL)
  - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias.
- - **Number of Parameters**: ~4b
+ - **Number of Parameters**: ~4B
  - **Context Length**: Supports up to 32,768 tokens.
 
  ## Requirements
@@ -100,7 +97,7 @@ For local use, applications such as Ollama, LMStudio, MLX-LM, llama.cpp, and KTr
 
  ## Evaluation & Performance
 
- STAR-4b has achieved outstanding performance for models of its size on renowned function calling benchmarks.
+ STAR-4b has achieved outstanding performance on renowned function calling benchmarks.
 
  - BFCLv3: Achieved 65.24% overall accuracy.
  - ACEBench: Achieved 74.10% summary score, demonstrating superior generalization and robustness.
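
The README's Sim-RL bullet contrasts a continuous, similarity-based reward with a simple binary reward for function calling. A minimal sketch of that idea, assuming a toy reward that splits credit between the function name and exactly-matched arguments (the 0.5/0.5 weighting and reward shape are illustrative assumptions, not the published STAR reward):

```python
# Hypothetical similarity-based reward for a function-calling rollout.
# NOT the published STAR/Sim-RL reward — an illustrative sketch only.

def binary_reward(pred: dict, ref: dict) -> float:
    """Sparse baseline: 1.0 only on an exact match with the reference call."""
    return 1.0 if pred == ref else 0.0

def similarity_reward(pred: dict, ref: dict) -> float:
    """Continuous reward: credit for the function name plus the fraction
    of reference arguments the prediction reproduces exactly."""
    name_score = 1.0 if pred.get("name") == ref.get("name") else 0.0
    ref_args = ref.get("arguments", {})
    if not ref_args:
        arg_score = 1.0 if pred.get("arguments", {}) == ref_args else 0.0
    else:
        matches = sum(
            1 for k, v in ref_args.items()
            if pred.get("arguments", {}).get(k) == v
        )
        arg_score = matches / len(ref_args)
    return 0.5 * name_score + 0.5 * arg_score

ref = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "C"}}
pred = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "F"}}
print(binary_reward(pred, ref))      # 0.0 — a near-miss earns nothing
print(similarity_reward(pred, ref))  # 0.75 — name + 1 of 2 arguments
```

The point of the continuous signal is visible in the example: the near-miss call gets 0.75 instead of 0, so policy optimization still receives gradient information on partially correct, multi-solution outputs.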
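
The CKD bullet describes augmenting a top-k forward KL divergence to suppress confidently incorrect predictions. As a rough illustration only — the penalty form and weighting here are assumptions, not the paper's objective — one way such a term could look over explicit next-token distributions:

```python
# Toy sketch of a "top-k forward KL + off-support penalty" loss.
# NOT the published CKD objective — an illustrative assumption only.
import math

def top_k_forward_kl(teacher: list[float], student: list[float],
                     k: int = 2, penalty_weight: float = 1.0) -> float:
    # Indices of the teacher's k most probable tokens.
    top_k = sorted(range(len(teacher)), key=lambda i: -teacher[i])[:k]
    # Forward KL restricted to the teacher's top-k support.
    kl = sum(teacher[i] * math.log(teacher[i] / student[i]) for i in top_k)
    # Penalize student probability mass placed outside the teacher's
    # top-k tokens — i.e., confidence the teacher does not endorse.
    off_support = sum(p for i, p in enumerate(student) if i not in top_k)
    return kl + penalty_weight * off_support

# Identical distributions: KL term is 0, but 0.1 of the student's mass
# sits outside the teacher's top-2, so the penalty still applies.
print(top_k_forward_kl([0.7, 0.2, 0.1], [0.7, 0.2, 0.1]))  # 0.1
```

Restricting the KL to the teacher's top-k keeps the student free to redistribute mass over low-probability tokens (preserving exploration), while the penalty term discourages confident bets the teacher rules out.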