Qwen-7B
🤗 Hugging Face | 🤖 ModelScope | 📑 Paper | 🖥️ Demo
WeChat (微信) | Discord | API
Introduction
Qwen-7B is the 7B-parameter version of the large language model series Qwen (abbreviated from Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model pretrained on a large volume of data, including web texts, books, code, and more. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant trained with alignment techniques. We have now updated both the pretrained and chat models for better performance. This repository hosts the Qwen-7B base language model.
The features of Qwen-7B include:
- Large-scale, high-quality training corpora: Qwen-7B is pretrained on over 2.4 trillion tokens, including Chinese, English, multilingual texts, code, and mathematics, covering general and professional fields. The distribution of the pretraining corpus has been optimized through extensive ablation experiments.
- Competitive performance: It significantly surpasses existing open-source models of similar scale on multiple Chinese and English downstream evaluation tasks (covering commonsense reasoning, code, mathematics, translation, etc.), and even outperforms some larger models on several benchmarks. See below for detailed evaluation results.
- More comprehensive vocabulary coverage: Compared with other open-source models built around Chinese and English vocabularies, Qwen-7B uses a vocabulary of over 150K tokens. This vocabulary is friendlier to multiple languages, enabling users to directly enhance the capability for certain languages without extending the vocabulary.
For more details about Qwen, please refer to the GitHub code repository.
Requirements
- Python 3.8 or later
- PyTorch 1.12 or later (2.0 or later recommended)
- CUDA 11.4 or later recommended (relevant for GPU and flash-attention users; a quick environment check is sketched below)
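You can verify these requirements programmatically (a minimal sketch; the version thresholds simply mirror the list above):

```python
# Minimal check of the Python / PyTorch / CUDA requirements listed above.
import sys
import torch

assert sys.version_info >= (3, 8), "Python 3.8 or later is required"
print("PyTorch:", torch.__version__)              # 1.12+ required, 2.0+ recommended
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA runtime:", torch.version.cuda)    # 11.4+ recommended
```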
Dependencies
To run Qwen-7B, please make sure the above requirements are met, and then execute the following pip command to install the dependent libraries:
```bash
pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed
```
In addition, it is recommended to install the flash-attention library (flash attention 2 is now supported) for higher efficiency and lower memory usage:
```bash
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install .
# The installations below are optional and might be slow.
# pip install csrc/layer_norm
# pip install csrc/rotary
```
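After building, you can verify that the library is importable (a minimal sketch; if the import fails, the model is expected to run with standard attention instead of the fast kernels):

```python
# Optional sanity check that flash-attention was built and installed correctly.
try:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn not found; the fast attention kernels will not be used.")
```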
Quickstart
You can easily load and use the model with the following code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig

# Note: The default behavior now has injection attack prevention off.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

# use bf16
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", device_map="auto", trust_remote_code=True, bf16=True).eval()
# use fp16
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", device_map="auto", trust_remote_code=True, fp16=True).eval()
# use cpu only
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", device_map="cpu", trust_remote_code=True).eval()
# use auto mode, automatically select precision based on the device.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", device_map="auto", trust_remote_code=True).eval()

# Specify hyperparameters for generation. But if you use transformers>=4.32.0, there is no need to do this.
# model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

inputs = tokenizer('蒙古国的首都是乌兰巴托（Ulaanbaatar）\n冰岛的首都是雷克雅未克（Reykjavik）\n埃塞俄比亚的首都是', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
# 蒙古国的首都是乌兰巴托（Ulaanbaatar）\n冰岛的首都是雷克雅未克（Reykjavik）\n埃塞俄比亚的首都是亚的斯亚贝巴（Addis Ababa）...
```
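Generation hyperparameters can also be overridden per call with standard transformers keyword arguments (a minimal sketch; the sampling values below are illustrative, not the model's tuned defaults):

```python
# Override generation settings for a single call (illustrative values only).
pred = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.8, temperature=0.7)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```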
For more usage instructions, please refer to our GitHub repo.
Tokenizer
Our tokenizer, based on tiktoken, is different from other tokenizers such as the sentencepiece tokenizer. Pay close attention to special tokens, especially during fine-tuning. For more detailed information on the tokenizer and its use in fine-tuning, please refer to the documentation.
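For example, you can load the tokenizer and observe these properties directly (a minimal sketch; the exact token splits depend on the version of the remote tokenizer code):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

# Unlike many sentencepiece setups, no BOS/EOS special tokens are added
# automatically, so fine-tuning code must handle special tokens explicitly.
ids = tokenizer("Qwen uses a tiktoken-based BPE tokenizer.")["input_ids"]
print(len(ids), ids)
print(tokenizer.decode(ids))

# The vocabulary splits numbers into single digits (see the Model section below).
print(tokenizer.tokenize("12345"))
```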
Model
The details of the model architecture of Qwen-7B are listed as follows:
| Hyperparameter | Value |
|---|---|
| n_layers | 32 |
| n_heads | 32 |
| d_model | 4096 |
| vocab size | 151851 |
| sequence length | 8192 |
For position encoding, the FFN activation function, and normalization, we adopt the prevalent practices: RoPE relative position encoding, SwiGLU as the activation function, and RMSNorm (with optional flash-attention for acceleration).
For tokenization, compared with current mainstream open-source models based on Chinese and English vocabularies, Qwen-7B uses a vocabulary of over 150K tokens. Built on the cl100k_base BPE vocabulary used by GPT-4, it is optimized for Chinese and multilingual text: it prioritizes efficient encoding of Chinese, English, and code data, and is also friendlier to many other languages, enabling users to directly enhance the capability for certain languages without extending the vocabulary. It splits numbers into single digits and calls the efficient tiktoken library for tokenization.
To compare the encoding compression rates of different models, we randomly sampled one million documents per language (with XLM-R, which supports 100 languages, as the baseline value of 1; lower is better). While ensuring efficient decoding of Chinese, English, and code, Qwen-7B also achieves a high compression rate on many widely used languages, such as Thai (th), Hebrew (he), Arabic (ar), Korean (ko), Vietnamese (vi), Japanese (ja), Turkish (tr), Indonesian (id), Polish (pl), Russian (ru), Dutch (nl), Portuguese (pt), Italian (it), German (de), Spanish (es), and French (fr). This equips the model with strong scalability as well as high training and inference efficiency in these languages.
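The compression comparison can be approximated with public tokenizers (a minimal sketch; xlm-roberta-base stands in for XLM-R here, and the two sample strings are placeholders rather than the actual evaluation corpus):

```python
from transformers import AutoTokenizer

# Compare tokens needed by Qwen-7B vs. the XLM-R baseline (lower is better).
qwen = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
xlmr = AutoTokenizer.from_pretrained("xlm-roberta-base")

def token_count(tokenizer, texts):
    return sum(len(tokenizer(t)["input_ids"]) for t in texts)

sample = [
    "สวัสดีครับ นี่คือตัวอย่างภาษาไทย",   # Thai placeholder text
    "これは日本語のサンプルです。",          # Japanese placeholder text
]
ratio = token_count(qwen, sample) / token_count(xlmr, sample)
print(f"Qwen tokens per XLM-R token: {ratio:.2f}")
```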
After deduplication and filtering, the pretraining corpus exceeds 2.4 trillion tokens, encompassing web text, encyclopedias, books, code, mathematics, and various professional domains.
Evaluation
We selected MMLU, C-Eval, GSM8K, MATH, HumanEval, MBPP, BBH, and CMMLU, which are currently popular benchmarks, to comprehensively evaluate the model's Chinese and English knowledge, translation, mathematical reasoning, coding, and other capabilities. The results below show that the Qwen models outperform similarly sized open-source models on all benchmarks.
| Model | MMLU (5-shot) | C-Eval (5-shot) | GSM8K (8-shot) | MATH (4-shot) | HumanEval (0-shot) | MBPP (3-shot) | BBH (3-shot) | CMMLU (5-shot) |
|---|---|---|---|---|---|---|---|---|
| LLaMA2-7B | 46.8 | 32.5 | 16.7 | 3.3 | 12.8 | 20.8 | 38.2 | 31.8 |
| LLaMA2-13B | 55.0 | 41.4 | 29.6 | 5.0 | 18.9 | 30.3 | 45.6 | 38.4 |
| LLaMA2-34B | 62.6 | - | 42.2 | 6.2 | 22.6 | 33.0 | 44.1 | - |
| ChatGLM2-6B | 47.9 | 51.7 | 32.4 | 6.5 | - | - | 33.7 | - |
| InternLM-7B | 51.0 | 53.4 | 31.2 | 6.3 | 10.4 | 14.0 | 37.0 | 51.8 |
| InternLM-20B | 62.1 | 58.8 | 52.6 | 7.9 | 25.6 | 35.6 | 52.5 | 59.0 |
| Baichuan2-7B | 54.7 | 56.3 | 24.6 | 5.6 | 18.3 | 24.2 | 41.6 | 57.1 |
| Baichuan2-13B | 59.5 | 59.0 | 52.8 | 10.1 | 17.1 | 30.2 | 49.0 | 62.0 |
| Qwen-7B (original) | 56.7 | 59.6 | 51.6 | - | 24.4 | 31.2 | 40.6 | 58.8 |
| Qwen-7B | 58.2 | 63.5 | 51.7 | 11.6 | 29.9 | 31.6 | 45.0 | 62.2 |
| Qwen-14B | 66.3 | 72.1 | 61.3 | 24.8 | 32.3 | 40.8 | 53.4 | 71.0 |
Long-Context Evaluation
We introduce techniques such as NTK-aware interpolation, LogN attention scaling, and window attention to extend the context length of the Qwen-7B (original) and Qwen-14B models from 2K to over 8K tokens, and that of Qwen-7B from 8K to 32K tokens. We conduct language modeling experiments on the arXiv dataset using perplexity (PPL; lower is better) at different sequence lengths. The results are shown below:
(To enable NTK interpolation and LogN attention scaling, set use_dynamic_ntk and use_logn_attn to true in config.json.)
| Model | 1024 | 2048 | 4096 | 8192 | 16384 | 32768 |
|---|---|---|---|---|---|---|
| Qwen-7B (original) | 4.23 | 3.78 | 39.35 | 469.81 | 2645.09 | - |
| + dynamic_ntk | 4.23 | 3.78 | 3.59 | 3.66 | 5.71 | - |
| + dynamic_ntk + logn | 4.23 | 3.78 | 3.58 | 3.56 | 4.62 | - |
| + dynamic_ntk + logn + window_attn | 4.23 | 3.78 | 3.58 | 3.49 | 4.32 | - |
| Qwen-7B | 4.23 | 3.81 | 3.52 | 3.31 | 7.27 | 181.49 |
| + dynamic_ntk + logn + window_attn | 4.23 | 3.81 | 3.52 | 3.33 | 3.22 | 3.17 |
| Qwen-14B | - | 3.46 | 22.79 | 334.65 | 3168.35 | - |
| + dynamic_ntk + logn + window_attn | - | 3.46 | 3.29 | 3.18 | 3.42 | - |
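If you prefer not to edit config.json by hand, the same flags can be set at load time (a minimal sketch; the field names use_dynamic_ntk and use_logn_attn come from the note above, and setting them on the loaded config is assumed to be equivalent to editing the file):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Enable the long-context techniques via the model config.
config = AutoConfig.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
config.use_dynamic_ntk = True   # NTK-aware interpolation
config.use_logn_attn = True     # LogN attention scaling

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B", config=config, device_map="auto", trust_remote_code=True
).eval()
```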
Reproduction
We provide evaluation scripts to reproduce the performance of our models; see the link for details. Note that, because of rounding errors introduced by hardware and frameworks, small fluctuations in the reproduced results are normal.
FAQ
If you run into problems, please consult the FAQ and existing issues for a solution before opening a new issue.
Citation
If you find our work helpful, feel free to cite us:
```bibtex
@article{qwen,
  title={Qwen Technical Report},
  author={Jinze Bai and Shuai Bai and Yunfei Chu and Zeyu Cui and Kai Dang and Xiaodong Deng and Yang Fan and Wenbin Ge and Yu Han and Fei Huang and Binyuan Hui and Luo Ji and Mei Li and Junyang Lin and Runji Lin and Dayiheng Liu and Gao Liu and Chengqiang Lu and Keming Lu and Jianxin Ma and Rui Men and Xingzhang Ren and Xuancheng Ren and Chuanqi Tan and Sinan Tan and Jianhong Tu and Peng Wang and Shijie Wang and Wei Wang and Shengguang Wu and Benfeng Xu and Jin Xu and An Yang and Hao Yang and Jian Yang and Shusheng Yang and Yang Yao and Bowen Yu and Hongyi Yuan and Zheng Yuan and Jianwei Zhang and Xingxuan Zhang and Yichang Zhang and Zhenru Zhang and Chang Zhou and Jingren Zhou and Xiaohuan Zhou and Tianhang Zhu},
  journal={arXiv preprint arXiv:2309.16609},
  year={2023}
}
```
License Agreement
Our code and model checkpoints are fully open to academic research, and commercial use is supported. Check the LICENSE for details of the open-source license. For commercial use, please fill out the questionnaire to apply.
Contact Us
If you are interested in leaving a message for our research or product teams, join our WeChat, DingTalk, or Discord groups! You are also welcome to reach us by email at qianwen_opensource@alibabacloud.com.