Update README.md
README.md
CHANGED
|
@@ -36,8 +36,8 @@ pip install git+https://github.com/huggingface/transformers accelerate
|
|
| 36 |
We offer a toolkit to help you handle various types of visual input more conveniently, as if you were using an API. This includes base64, URLs, and interleaved images and videos. You can install it using the following command:
|
| 37 |
|
| 38 |
```bash
|
| 39 |
-
# It's highly recommanded to
|
| 40 |
-
pip install keye-vl-utils
|
| 41 |
```
|
| 42 |
|
| 43 |
If you are not using Linux, you might not be able to install `decord` from PyPI. In that case, you can use `pip install keye-vl-utils` which will fall back to using torchvision for video processing. However, you can still [install decord from source](https://github.com/dmlc/decord?tab=readme-ov-file#install-from-source) to get decord used when loading video.
|
|
@@ -435,16 +435,6 @@ The post-training phase of Kwai Keye is meticulously designed into two phases wi
|
|
| 435 |
2. Keye-VL-8B demonstrates exceptional proficiency in video understanding. Across a comprehensive suite of authoritative public video benchmarks, including Video-MME, Video-MMMU, TempCompass, LongVideoBench, and MMVU, the model's performance significantly surpasses that of other top-tier models of a comparable size.
|
| 436 |
3. In evaluation sets that require complex logical reasoning and mathematical problem-solving, such as WeMath, MathVerse, and LogicVista, Kwai Keye-VL-8B displays a strong performance curve. This highlights its advanced capacity for logical deduction and solving complex quantitative problems.
|
| 437 |
|
| 438 |
-
## Requirements
|
| 439 |
-
The code of Kwai Keye-VL has been in the latest Hugging face transformers and we advise you to build from source with command:
|
| 440 |
-
```
|
| 441 |
-
pip install git+https://github.com/huggingface/transformers accelerate
|
| 442 |
-
```
|
| 443 |
-
or you might encounter the following error:
|
| 444 |
-
```
|
| 445 |
-
KeyError: 'Keye-VL'
|
| 446 |
-
```
|
| 447 |
-
|
| 448 |
## ✒️ Citation
|
| 449 |
|
| 450 |
If you find our work helpful for your research, please consider citing our work.
|
|
|
|
| 36 |
We offer a toolkit to help you handle various types of visual input more conveniently, as if you were using an API. This includes base64, URLs, and interleaved images and videos. You can install it using the following command:
|
| 37 |
|
| 38 |
```bash
|
| 39 |
+
# It's highly recommended to install decord for faster video loading.
|
| 40 |
+
pip install keye-vl-utils decord
|
| 41 |
```
|
| 42 |
|
| 43 |
If you are not using Linux, you might not be able to install `decord` from PyPI. In that case, you can use `pip install keye-vl-utils` which will fall back to using torchvision for video processing. However, you can still [install decord from source](https://github.com/dmlc/decord?tab=readme-ov-file#install-from-source) to get decord used when loading video.
|
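The fallback described above (use `decord` when it is installed, otherwise `torchvision`) can be checked from Python before loading any video. The snippet below is a minimal sketch and not part of `keye-vl-utils`: the helper name `pick_video_backend` is hypothetical, and it only reports which package is importable in the current environment. How `keye-vl-utils` itself selects a backend is not shown in this diff.

```python
import importlib.util


def pick_video_backend(preferred=("decord", "torchvision")):
    """Return the first importable package name from `preferred`, or None.

    Hypothetical helper for illustration only; it does not call into
    keye-vl-utils and does not import the packages it checks.
    """
    for name in preferred:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None
```

Calling `pick_video_backend()` after installation gives a quick way to confirm whether `decord` was picked up or whether video loading would likely rely on `torchvision` instead.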
|
|
|
| 435 |
2. Keye-VL-8B demonstrates exceptional proficiency in video understanding. Across a comprehensive suite of authoritative public video benchmarks, including Video-MME, Video-MMMU, TempCompass, LongVideoBench, and MMVU, the model's performance significantly surpasses that of other top-tier models of a comparable size.
|
| 436 |
3. In evaluation sets that require complex logical reasoning and mathematical problem-solving, such as WeMath, MathVerse, and LogicVista, Kwai Keye-VL-8B displays a strong performance curve. This highlights its advanced capacity for logical deduction and solving complex quantitative problems.
|
| 437 |
|
| 438 |
## ✒️ Citation
|
| 439 |
|
| 440 |
If you find our work helpful for your research, please consider citing our work.
|