Update README.md
<p>

<p align="center">
Kimi-Audio-7B-Instruct <a href="https://huggingface.co/moonshotai/Kimi-Audio-7B-Instruct">🤗</a> | 📑 <a href="https://raw.githubusercontent.com/MoonshotAI/Kimi-Audio/main/assets/kimia_report.pdf">Paper</a>
</p>

## Introduction

We present Kimi-Audio, an open-source audio foundation model excelling in **audio understanding, generation, and conversation**. This repository hosts the model checkpoints for Kimi-Audio-7B-Instruct.

Kimi-Audio is designed as a universal audio foundation model capable of handling a wide variety of audio processing tasks within a single unified framework. Key features include:

For more details, please refer to our [GitHub Repository](https://github.com/MoonshotAI/Kimi-Audio).
## Requirements

We recommend building a Docker image to run inference. After cloning the inference code, you can build the image with the `docker build` command:

```bash
git clone https://github.com/MoonshotAI/Kimi-Audio
cd Kimi-Audio
docker build -t kimi-audio:v0.1 .
```
Alternatively, you can use our pre-built image:

```bash
docker pull moonshotai/kimi-audio:v0.1
```
Or, you can install the requirements directly:

```bash
pip install -r requirements.txt
```

You may refer to the Dockerfile in case of any environment issues.
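If you take the Docker route, inference is then run inside the container. A minimal invocation might look like the following sketch; the mount point, working directory, and shell entrypoint are illustrative assumptions, and only the image tag comes from the build step above:

```shell
# Illustrative sketch: expose the GPU and mount the checkout into the container.
# The /workspace path and the bash entrypoint are assumptions, not taken from
# the repository; --gpus requires the NVIDIA Container Toolkit on the host.
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace -w /workspace \
    kimi-audio:v0.1 bash
```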
## Quickstart

This example demonstrates basic usage for generating text from audio (ASR) and generating both text and speech in a conversational turn using the `Kimi-Audio-7B-Instruct` model.
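As a sketch of what such a call looks like, the snippet below builds the message list for an ASR request. The message schema (`role`, `message_type`, `content`) and the `kimia_infer` import shown in the comments follow the example code in the GitHub repository, but treat them as assumptions here; the repository's Quickstart is authoritative.

```python
# Build the conversation payload for an ASR (speech-to-text) request.
# The message schema below mirrors the repository's example code and is an
# assumption in this sketch, not a verified interface.

def build_asr_messages(audio_path, prompt="Please transcribe the following audio:"):
    """Return a single-turn conversation asking the model to transcribe audio."""
    return [
        {"role": "user", "message_type": "text", "content": prompt},
        {"role": "user", "message_type": "audio", "content": audio_path},
    ]

messages = build_asr_messages("test_audios/asr_example.wav")

# The actual inference call (requires the cloned repo, the downloaded
# checkpoint, and a GPU) would then look roughly like:
#   from kimia_infer.api.kimia import KimiAudio
#   model = KimiAudio(model_path="moonshotai/Kimi-Audio-7B-Instruct")
#   _, text = model.generate(messages, output_type="text")
```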