1f committed on
Commit
81aa597
·
verified ·
1 Parent(s): 09e811a

Add files using upload-large-folder tool

Files changed (20)
  1. r1-a/response_generation/Kimi-Audio/.gitignore +179 -0
  2. r1-a/response_generation/minicpm/MiniCPM-o/eval_mm/vlmevalkit/vlmeval/dataset/dude.py +211 -0
  3. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/router/guard/stateGuard.js +1 -0
  4. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/router/menu/index.js +10 -0
  5. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/base.css +56 -0
  6. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/element/index.less +79 -0
  7. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/element/variable.less +51 -0
  8. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/main.css +30 -0
  9. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/variable.css +7 -0
  10. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/utils/index.js +44 -0
  11. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/utils/websocket.js +91 -0
  12. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/Chatbot.vue +3 -0
  13. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VideoCall.vue +971 -0
  14. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VideoCall_0105.vue +955 -0
  15. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VoiceCall.vue +833 -0
  16. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VoiceCall_0105.vue +829 -0
  17. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/audioBufferToMp3Base64.js +36 -0
  18. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/merge.js +132 -0
  19. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/mergeMp3Base64.js +29 -0
  20. r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/index.vue +262 -0
r1-a/response_generation/Kimi-Audio/.gitignore ADDED
@@ -0,0 +1,179 @@
1
+ # Byte-compiled / optimized / DLL files
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+
6
+ # C extensions
7
+ *.so
8
+
9
+ # Distribution / packaging
10
+ .Python
11
+ build/
12
+ develop-eggs/
13
+ dist/
14
+ downloads/
15
+ eggs/
16
+ .eggs/
17
+ lib/
18
+ lib64/
19
+ parts/
20
+ sdist/
21
+ var/
22
+ wheels/
23
+ share/python-wheels/
24
+ *.egg-info/
25
+ .installed.cfg
26
+ *.egg
27
+ MANIFEST
28
+
29
+ # PyInstaller
30
+ # Usually these files are written by a python script from a template
31
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
32
+ *.manifest
33
+ *.spec
34
+
35
+ # Installer logs
36
+ pip-log.txt
37
+ pip-delete-this-directory.txt
38
+
39
+ # Unit test / coverage reports
40
+ htmlcov/
41
+ .tox/
42
+ .nox/
43
+ .coverage
44
+ .coverage.*
45
+ .cache
46
+ nosetests.xml
47
+ coverage.xml
48
+ *.cover
49
+ *.py,cover
50
+ .hypothesis/
51
+ .pytest_cache/
52
+ cover/
53
+
54
+ # Translations
55
+ *.mo
56
+ *.pot
57
+
58
+ # Django stuff:
59
+ *.log
60
+ local_settings.py
61
+ db.sqlite3
62
+ db.sqlite3-journal
63
+
64
+ # Flask stuff:
65
+ instance/
66
+ .webassets-cache
67
+
68
+ # Scrapy stuff:
69
+ .scrapy
70
+
71
+ # Sphinx documentation
72
+ docs/_build/
73
+
74
+ # PyBuilder
75
+ .pybuilder/
76
+ target/
77
+
78
+ # Jupyter Notebook
79
+ .ipynb_checkpoints
80
+
81
+ # IPython
82
+ profile_default/
83
+ ipython_config.py
84
+
85
+ # pyenv
86
+ # For a library or package, you might want to ignore these files since the code is
87
+ # intended to run in multiple environments; otherwise, check them in:
88
+ # .python-version
89
+
90
+ # pipenv
91
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
92
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
93
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
94
+ # install all needed dependencies.
95
+ #Pipfile.lock
96
+
97
+ # poetry
98
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
99
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
100
+ # commonly ignored for libraries.
101
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
102
+ #poetry.lock
103
+
104
+ # pdm
105
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
106
+ #pdm.lock
107
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
108
+ # in version control.
109
+ # https://pdm.fming.dev/#use-with-ide
110
+ .pdm.toml
111
+
112
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
113
+ __pypackages__/
114
+
115
+ # Celery stuff
116
+ celerybeat-schedule
117
+ celerybeat.pid
118
+
119
+ # SageMath parsed files
120
+ *.sage.py
121
+
122
+ # Environments
123
+ .env
124
+ .env_*
125
+ .venv
126
+ env/
127
+ venv/
128
+ ENV/
129
+ env.bak/
130
+ venv.bak/
131
+
132
+ # Spyder project settings
133
+ .spyderproject
134
+ .spyproject
135
+
136
+ # Rope project settings
137
+ .ropeproject
138
+
139
+ # mkdocs documentation
140
+ /site
141
+
142
+ # mypy
143
+ .mypy_cache/
144
+ .dmypy.json
145
+ dmypy.json
146
+
147
+ # Pyre type checker
148
+ .pyre/
149
+
150
+ # pytype static type analyzer
151
+ .pytype/
152
+
153
+ # Cython debug symbols
154
+ cython_debug/
155
+
156
+ # PyCharm
157
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
158
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
159
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
160
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
161
+ .idea/
162
+ data
163
+ logs
164
+ *.zip
165
+ conf
166
+ .DS_Store
167
+ .ruff_cache
168
+ .log
169
+ *.jsonl
170
+ *.parquet
171
+ *.progress
172
+ # Vscode
173
+ .vscode
174
+
175
+ *.safetensors
176
+ *.model
177
+ *.pt
178
+ *.pth
179
+ test_audios/output
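Note on the patterns above: entries such as `*.py[cod]` and `*.safetensors` are shell-style globs. The sketch below checks a few file names against a handful of these patterns using Python's `fnmatch` module. It is only an approximation under stated assumptions: the helper name `is_ignored` and the sample file names are made up for illustration, and real gitignore matching also handles directories, `/` anchoring and `!` negation, none of which is modeled here.

```python
from fnmatch import fnmatchcase

# A few patterns copied from the .gitignore above.
PATTERNS = ["*.py[cod]", "*.safetensors", "*.pt", "*.pth", ".coverage.*"]


def is_ignored(filename: str) -> bool:
    """Return True if the bare file name matches any of the patterns.

    Hypothetical helper: it only compares a file name against glob
    patterns and does not implement full gitignore semantics.
    """
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


if __name__ == "__main__":
    for name in ["model.pyc", "model.py", "weights.safetensors", "ckpt.pth"]:
        print(name, is_ignored(name))
```

The character class `[cod]` is why `model.pyc` matches while `model.py` does not: the pattern requires exactly one of `c`, `o` or `d` after `.py`.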
r1-a/response_generation/minicpm/MiniCPM-o/eval_mm/vlmevalkit/vlmeval/dataset/dude.py ADDED
@@ -0,0 +1,211 @@
+ import math
+ from typing import List
+
+ from .utils.judge_util import build_judge
+ from .image_base import ImageBaseDataset
+ from .mmlongbench import concat_images, MMLongBench_auxeval, anls_compute
+ from ..smp import *
+
+
+ FAIL_MSG = 'Failed to obtain answer via API.'
+
+
+ def DUDE_acc(result_file):
+     data = load(result_file)
+     overall_score = 0.0
+     score_list = list()
+     for i in range(len(data)):
+         item = data.iloc[i]
+         if isinstance(item['answer'], float) and math.isnan(item['answer']):
+             item['answer'] = 'Not answerable'
+
+         item['answer'] = item['answer'].lower()
+         item['pred'] = item['pred'].lower()
+         score = anls_compute(item['answer'], item['pred'])
+         score_list.append(score)
+         overall_score += score
+
+     data['score'] = score_list
+     dump(data, result_file)
+
+     res = dict()
+     res['category'], res['num'], res['avg_score'] = ['anls'], [len(data)], [overall_score / len(data)]
+     res = pd.DataFrame(res)
+     return res
+
+
+ class DUDE(ImageBaseDataset):
+
+     TYPE = 'VQA'
+
+     DATASET_URL = {
+         'DUDE': 'https://opencompass.openxlab.space/utils/VLMEval/DUDE.tsv',
+         'DUDE_MINI': 'https://opencompass.openxlab.space/utils/VLMEval/DUDE_MINI.tsv',
+     }
+     DATASET_MD5 = {
+         'DUDE': '130d860d08206e1e407cd77150c10d88',
+         'DUDE_MINI': 'e0c0d998114f0cca7516d12039d2b538',
+     }
+
+     SUPPORTED_MODELS = {
+         'GPT4': (1, 1),
+         'GPT4V': (1, 1),
+         'GPT4V_HIGH': (1, 1),
+         'GPT4o': (1, 1),
+         'GPT4o_HIGH': (1, 1),
+         'GPT4o_MINI': (1, 1),
+         'XComposer2d5': (1, -1),
+         'XComposer2_4KHD': (1, -1),
+         'MiniCPM-Llama3-V-2_5': (1, 5),
+         'InternVL-Chat-V1-5': (5, 2),
+     }
+
+     def __init__(self, dataset, **kwargs):
+         self.model_list = list(self.SUPPORTED_MODELS.keys())
+         model_name = kwargs['model']
+         if not listinstr(self.model_list, model_name):
+             raise AssertionError("{} doesn't support the evaluation on DUDE.".format(model_name))
+         super(DUDE, self).__init__(dataset)
+
+         self.is_api = True if listinstr(['GPT4'], model_name) else False
+         self.max_pages = 120
+         concat_num, column_num = self.SUPPORTED_MODELS.get(model_name)
+         self.concat_num = concat_num
+         self.column_num = column_num
+
+     def prepare_tsv(self, url, file_md5=None):
+         data_root = LMUDataRoot()
+         os.makedirs(data_root, exist_ok=True)
+         file_name = url.split('/')[-1]
+         data_path = osp.join(data_root, file_name)
+         if osp.exists(data_path) and (file_md5 is None or md5(data_path) == file_md5):
+             pass
+         else:
+             warnings.warn('The dataset tsv is not downloaded')
+             download_file(url, data_path)
+         return load(data_path)
+
+     def dump_image(self, origin_line):
+         os.makedirs(self.img_root, exist_ok=True)
+         try:
+             import fitz
+         except Exception as e:
+             logging.critical(f'{type(e)}: {e}')
+             logging.critical('Please use `pip install pymupdf` to parse PDF files.')
+
+         line = origin_line.copy()
+         if not isinstance(line['image_path'], List):
+             line['image_path'] = [line['image_path']]
+         line['image_path'] = line['image_path'][:self.max_pages]
+         skip_pdf_parse = True
+         for im_name in line['image_path']:
+             path = osp.join(self.img_root, im_name)
+             if not read_ok(path):
+                 skip_pdf_parse = False
+                 break
+
+         # Just for being compatible with the zipped loop: zip(line['image'], line['image_path'])
+         if skip_pdf_parse:
+             line['image'] = line['image_path']
+         else:
+             pdf_data = base64.b64decode(line['image'])
+             pdf_file = io.BytesIO(pdf_data)
+             encoded_images = []
+             with fitz.open(stream=pdf_file, filetype='pdf') as doc:
+                 doc = doc[:self.max_pages]
+                 for page in doc:
+                     image = page.get_pixmap(dpi=144)
+                     image_file = io.BytesIO(image.tobytes(output='png'))
+                     image = Image.open(image_file)
+                     encoded_image = encode_image_to_base64(image)
+                     encoded_images.append(encoded_image)
+             line['image'] = encoded_images
+             print('process {}'.format(line['doc_id']))
+
+         if 'image' in line:
+             if isinstance(line['image'], list):
+                 tgt_path = []
+                 assert 'image_path' in line
+                 for img, im_name in zip(line['image'], line['image_path']):
+                     path = osp.join(self.img_root, im_name)
+                     if not read_ok(path):
+                         decode_base64_to_image_file(img, path)
+                     tgt_path.append(path)
+             else:
+                 tgt_path = osp.join(self.img_root, f"{line['index']}.jpg")
+                 if not read_ok(tgt_path):
+                     decode_base64_to_image_file(line['image'], tgt_path)
+                 tgt_path = [tgt_path]
+         else:
+             assert 'image_path' in line
+             tgt_path = toliststr(line['image_path'])
+
+         if self.concat_num > 0 and not self.is_api:
+             concatenated_images = concat_images(tgt_path, max_concat=self.concat_num, column_num=self.column_num)
+
+             old_tgt_path = tgt_path
+             assert isinstance(old_tgt_path, list)
+             if self.column_num != -1:
+                 tgt_path = [
+                     '_'.join(old_tgt_path[0].split('_')[:-1]) + '_concat{}_{}.jpg'.format(self.concat_num, i)
+                     for i in range(len(concatenated_images))
+                 ]
+             else:
+                 tgt_path = ['_'.join(old_tgt_path[0].split('_')[:-1]) + '_concat_all.jpg']
+
+             for path, concatenated_image in zip(tgt_path, concatenated_images):
+                 if not read_ok(path):
+                     decode_base64_to_image_file(encode_image_to_base64(concatenated_image), path)
+                     num_images, image_size = len(old_tgt_path), concatenated_image.size
+                     print('concat {} images to a new one with size {}. save at {}'.format(num_images, image_size, path))
+         return tgt_path
+
+     @classmethod
+     def evaluate(self, eval_file, **judge_kwargs):
+         logger = get_logger('Evaluation')
+         model = judge_kwargs['model']
+
+         suffix = eval_file.split('.')[-1]
+         storage = eval_file.replace(f'.{suffix}', f'_{model}.xlsx')
+         tmp_file = eval_file.replace(f'.{suffix}', f'_{model}.pkl')
+
+         if osp.exists(storage):
+             logger.warning(f'GPT scoring file {storage} already exists, will reuse it in DUDE_eval. ')
+         else:
+             data = load(eval_file)
+             model = build_judge(max_tokens=128, **judge_kwargs)
+             lt = len(data)
+             lines = [data.iloc[i] for i in range(lt)]
+             tups = [(model, line) for line in lines]
+             indices = [line['index'] for line in lines]
+
+             ans = {}
+             if osp.exists(tmp_file):
+                 ans = load(tmp_file)
+             tups = [x for x, i in zip(tups, indices) if i not in ans]
+             indices = [i for i in indices if i not in ans]
+
+             # Judge only the lines without a cached result, keyed by line index.
+             # Assumes entries cached in tmp_file are dicts with 'log', 'res' and 'pred',
+             # like the fresh results returned by MMLongBench_auxeval.
+             for idx, (model, line) in zip(indices, tqdm(tups)):
+                 ans[idx] = MMLongBench_auxeval(model, line)
+
+             data['res'] = [ans[idx]['res'] for idx in data['index']]
+             data['log'] = [ans[idx]['log'] for idx in data['index']]
+             data['pred'] = [ans[idx]['pred'] for idx in data['index']]
+             dump(data, storage)
+
+         score = DUDE_acc(storage)
+         score_pth = storage.replace('.xlsx', '_score.csv')
+
+         dump(score, score_pth)
+         logger.info(f'DUDE successfully finished evaluating {eval_file}, results saved in {score_pth}')
+         logger.info('Score: ')
+         logger.info(score)
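Note on scoring: `DUDE_acc` relies on `anls_compute`, which is imported from `.mmlongbench` and is not part of this diff. As a rough illustration of what an ANLS-style (Average Normalized Levenshtein Similarity) score typically computes, here is a minimal sketch. The function names `levenshtein` and `anls_score`, the 0.5 threshold, and the normalization by the longer string are assumptions taken from the common ANLS definition and may differ from the actual implementation in the repository.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed with a rolling row."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def anls_score(answer: str, pred: str, threshold: float = 0.5) -> float:
    """Hypothetical ANLS-style score: 1 - normalized distance, or 0 past the threshold."""
    answer, pred = answer.strip().lower(), pred.strip().lower()
    longest = max(len(answer), len(pred))
    if longest == 0:
        return 1.0
    distance = levenshtein(answer, pred) / longest
    return 1.0 - distance if distance < threshold else 0.0


if __name__ == "__main__":
    print(anls_score("Not answerable", "not answerable"))
    print(anls_score("invoice", "invoise"))
    print(anls_score("cat", "dog"))
```

Under this definition a near miss such as `invoise` for `invoice` still earns partial credit, while an unrelated prediction scores zero, which is why `DUDE_acc` can simply average the per-item scores.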
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/router/guard/stateGuard.js ADDED
@@ -0,0 +1 @@
1
+ export function createStateGuard() {}
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/router/menu/index.js ADDED
@@ -0,0 +1,10 @@
1
+ export const basicRoutes = [
2
+ {
3
+ path: '/',
4
+ component: () => import('@/views/home/index.vue')
5
+ },
6
+ {
7
+ path: '/:port',
8
+ component: () => import('@/views/home/index.vue')
9
+ }
10
+ ];
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/base.css ADDED
@@ -0,0 +1,56 @@
1
+ *,
2
+ *::before,
3
+ *::after {
4
+ box-sizing: border-box;
5
+ margin: 0;
6
+ }
7
+
8
+ ::-webkit-scrollbar {
9
+ width: 6px;
10
+ height: 6px;
11
+ }
12
+ ::-webkit-scrollbar-thumb {
13
+ background: #e0e4ee;
14
+ border-radius: 4px;
15
+ }
16
+
17
+ html,
18
+ body {
19
+ width: 100%;
20
+ height: 100%;
21
+ font-family:
22
+ Inter,
23
+ -apple-system,
24
+ BlinkMacSystemFont,
25
+ Segoe UI,
26
+ SF Pro SC,
27
+ SF Pro Display,
28
+ SF Pro Icons,
29
+ PingFang SC,
30
+ Hiragino Sans GB,
31
+ Microsoft YaHei,
32
+ Helvetica Neue,
33
+ Helvetica,
34
+ Arial,
35
+ sans-serif !important;
36
+ background: #f3f3f3;
37
+ transition:
38
+ color 0.5s,
39
+ background-color 0.5s;
40
+ line-height: 1.3;
41
+ font-size: 14px;
42
+ font-weight: 400;
43
+ color: var(--el-text-color-regular);
44
+ text-rendering: optimizeLegibility;
45
+ -webkit-font-smoothing: antialiased;
46
+ -moz-osx-font-smoothing: grayscale;
47
+ margin: 0;
48
+ padding: 0;
49
+ overflow: hidden;
50
+ }
51
+
52
+ #app {
53
+ width: 100%;
54
+ height: 100%;
55
+ padding: 16px 4vw;
56
+ }
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/element/index.less ADDED
@@ -0,0 +1,79 @@
1
+ @import url('./variable.less');
2
+
3
+ .el-message {
4
+ box-shadow: 0px 4px 13px 2px rgba(75, 79, 88, 0.11);
5
+ border: none;
6
+ border-radius: 8px;
7
+ top: 60px !important;
8
+ &--success,
9
+ &--error {
10
+ .el-message__content {
11
+ color: rgb(10, 10, 10);
12
+ font-size: 14px;
13
+ }
14
+ }
15
+ .el-message-icon--error {
16
+ color: var(--el-color-danger);
17
+ font-size: 16px;
18
+ }
19
+ .el-message-icon--success {
20
+ color: var(--el-color-success);
21
+ font-size: 16px;
22
+ }
23
+ }
24
+ .el-message.time-warning,
25
+ .el-message.system-error {
26
+ width: calc(100vw - 200px);
27
+ padding: 16px 12px;
28
+ border-radius: 12px;
29
+ }
30
+ .el-message.el-message--warning.time-warning {
31
+ border: 1px solid #f9ac2a;
32
+ background: #fef7ea;
33
+ .el-icon {
34
+ display: none;
35
+ }
36
+ .el-message__content {
37
+ color: #2f333e;
38
+ font-family: PingFang SC;
39
+ font-size: 14px;
40
+ font-style: normal;
41
+ font-weight: 400;
42
+ line-height: normal;
43
+ padding-left: 28px;
44
+ position: relative;
45
+ }
46
+ .el-message__content::before {
47
+ position: absolute;
48
+ content: '';
49
+ width: 20px;
50
+ height: 20px;
51
+ background: url('@/assets/svg/warning.svg') no-repeat;
52
+ left: 0;
53
+ }
54
+ }
55
+ .el-message.el-message--error.system-error {
56
+ border: 1px solid #e72b00;
57
+ background: #ffebe7;
58
+ .el-icon {
59
+ display: none;
60
+ }
61
+ .el-message__content {
62
+ color: #2f333e;
63
+ font-family: PingFang SC;
64
+ font-size: 14px;
65
+ font-style: normal;
66
+ font-weight: 400;
67
+ line-height: normal;
68
+ padding-left: 28px;
69
+ position: relative;
70
+ }
71
+ .el-message__content::before {
72
+ position: absolute;
73
+ content: '';
74
+ width: 20px;
75
+ height: 20px;
76
+ background: url('@/assets/svg/error.svg') no-repeat;
77
+ left: 0;
78
+ }
79
+ }
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/element/variable.less ADDED
@@ -0,0 +1,51 @@
1
+ :root {
2
+ --el-component-size-large: 48px;
3
+ --el-component-size: 40px;
4
+ --el-color-primary: #7661ff;
5
+ --el-color-danger: #de0000;
6
+ --el-color-warning: #ff7d00;
7
+ --el-color-success: #00b42a;
8
+ --el-text-color-regular: #0a0a0a;
9
+ }
10
+ .el-button {
11
+ --el-button-height: var(--el-component-size);
12
+ height: var(--el-button-height);
13
+ &--large {
14
+ --el-button-height: var(--el-component-size-large);
15
+ height: var(--el-button-height);
16
+ }
17
+ &--primary {
18
+ --el-button-bg-color: #7661ff;
19
+ --el-button-text-color: var(--el-color-white);
20
+ --el-button-border-color: #7661ff;
21
+ --el-button-hover-bg-color: rgb(159, 144, 255);
22
+ --el-button-hover-text-color: var(--el-color-white);
23
+ --el-button-hover-border-color: rgb(159, 144, 255);
24
+ --el-button-active-bg-color: rgb(98, 82, 208);
25
+ --el-button-active-border-color: rgb(98, 82, 208);
26
+ --el-button-disabled-bg-color: #d4cdff;
27
+ --el-button-disabled-border-color: #d4cdff;
28
+ }
29
+ }
30
+ .el-checkbox {
31
+ --el-checkbox-border-radius: 4px;
32
+ --el-checkbox-input-border: 1px solid rgb(188, 188, 188);
33
+ --el-checkbox-input-border-color-hover: rgb(61, 92, 255);
34
+ --el-checkbox-checked-bg-color: rgb(61, 92, 255);
35
+ --el-checkbox-checked-input-border-color: rgb(61, 92, 255);
36
+ }
37
+ .el-dialog {
38
+ &__header {
39
+ padding-bottom: 20px;
40
+ }
41
+ &__title {
42
+ --el-text-color-primary: rgb(10, 10, 10);
43
+ --el-dialog-title-font-size: 18px;
44
+ --el-dialog-font-line-height: 20px;
45
+ }
46
+ }
47
+ .el-message {
48
+ --el-message-padding: 11px 20px;
49
+ --el-message-bg-color: rgb(255, 255, 255);
50
+ --el-message-text-color: #de0000;
51
+ }
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/main.css ADDED
@@ -0,0 +1,30 @@
1
+ @import './base.css';
2
+ @import './variable.css';
3
+
4
+ .layout-root {
5
+ width: 100%;
6
+ height: 100%;
7
+ display: flex;
8
+ flex-direction: column;
9
+ }
10
+
11
+ .layout-main {
12
+ flex: 1 1 0;
13
+ display: flex;
14
+ flex-direction: column;
15
+ padding: 0 var(--layout-main-padding);
16
+ }
17
+
18
+ .layout-footer {
19
+ width: 100%;
20
+ max-width: var(--layout-content-width);
21
+ height: fit-content;
22
+ display: flex;
23
+ flex-direction: column;
24
+ padding: 0 var(--layout-main-padding);
25
+ margin: auto;
26
+ }
27
+
28
+ :focus-visible {
29
+ outline: none;
30
+ }
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/styles/variable.css ADDED
@@ -0,0 +1,7 @@
1
+ :root {
2
+ --layout-sidebar-width: 56px;
3
+ --layout-sidebar-left-space: 8px;
4
+ --layout-main-padding: 8px;
5
+ --layout-main-minWidth: calc(var(--layout-content-width) + var(--layout-main-padding) * 2);
6
+ --layout-content-width: 780px;
7
+ }
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/utils/index.js ADDED
@@ -0,0 +1,44 @@
1
+ // Determine whether the device is a PC or a mobile device
2
+ export const isMobile = () => {
3
+ let flag = /Android|webOS|iPhone|iPad|iPod|BlackBerry|IEMobile|Opera Mini|Linux/i.test(navigator.userAgent);
4
+ const platform = navigator.platform;
5
+ // Safari on iPad
6
+ if (platform === 'MacIntel' && navigator.maxTouchPoints > 1) {
7
+ flag = true;
8
+ }
9
+ return flag;
10
+ };
11
+ // Length of one audio chunk (unit: ms)
12
+ const voicePerLength = 200;
13
+
14
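+ // Image counter: works out on which audio send an image must be sent as well. For example, if one audio chunk is 100 ms and an image is sent once per second, the 10th audio chunk sent must carry an image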
15
+ export const maxCount = 1000 / voicePerLength;
16
+
17
+ export const getChunkLength = sampleRate => {
18
+ return sampleRate * (voicePerLength / 1000);
19
+ };
20
+
21
+ export const isAvailablePort = port => {
22
+ return [
23
+ 8000, 8001, 8002, 8003, 8004, 8010, 8011, 8012, 8013, 8014, 8020, 8021, 8022, 8023, 8024, 8025, 8026, 8027,
24
+ 8028, 32449
25
+ ].includes(port);
26
+ };
27
+
28
+ // Convert a file to base64 format
29
+ export const fileToBase64 = file => {
30
+ return new Promise((resolve, reject) => {
31
+ if (!file) {
32
+ reject('File must not be empty');
33
+ }
34
+ const reader = new FileReader();
35
+ reader.onload = e => {
36
+ const base64String = e.target.result;
37
+ resolve(base64String);
38
+ };
39
+ reader.onerror = () => {
40
+ reject('Failed to encode file');
41
+ };
42
+ reader.readAsDataURL(file);
43
+ });
44
+ };
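Note on the chunk arithmetic in `utils/index.js`: `maxCount` is the number of audio chunks per second and `getChunkLength` is the number of samples per chunk. The sketch below reproduces that arithmetic in Python purely as a worked example. The 16000 Hz sample rate is an assumed value for illustration (the diff does not state which rate the caller passes), and the Python function names are hypothetical mirrors of the JavaScript ones.

```python
VOICE_PER_LENGTH_MS = 200  # mirrors voicePerLength in utils/index.js


def max_count(voice_per_length_ms: int = VOICE_PER_LENGTH_MS) -> float:
    """Audio chunks per second, i.e. maxCount = 1000 / voicePerLength."""
    return 1000 / voice_per_length_ms


def chunk_length(sample_rate: int, voice_per_length_ms: int = VOICE_PER_LENGTH_MS) -> float:
    """Samples per audio chunk, i.e. getChunkLength(sampleRate)."""
    return sample_rate * (voice_per_length_ms / 1000)


if __name__ == "__main__":
    # Assumed sample rate, chosen only to make the numbers concrete.
    sample_rate = 16000
    print("chunks per second:", max_count())
    print("samples per chunk:", chunk_length(sample_rate))
```

With 200 ms chunks there are five chunks per second, so under these values an image accompanies every fifth audio chunk.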
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/utils/websocket.js ADDED
@@ -0,0 +1,91 @@
1
+ class WebSocketClient {
2
+ constructor(url, maxReconnectAttempts = 5, reconnectInterval = 5000) {
3
+ this.url = url;
4
+ this.socket = null;
5
+ this.eventHandlers = {};
6
+ this.maxReconnectAttempts = maxReconnectAttempts;
7
+ this.reconnectInterval = reconnectInterval;
8
+ this.reconnectAttempts = 0;
9
+ this.reconnectTimer = null;
10
+ }
11
+
12
+ connect() {
13
+ this.reconnectAttempts = 0;
14
+ this.establishConnection();
15
+ }
16
+
17
+ establishConnection() {
18
+ this.socket = new WebSocket(this.url);
19
+
20
+ this.socket.onopen = () => {
21
+ console.log('WebSocket connection opened');
22
+ this.reconnectAttempts = 0; // Reset reconnect attempts on successful connection
23
+ this.emit('open');
24
+ };
25
+
26
+ this.socket.onclose = event => {
27
+ console.log('WebSocket connection closed', event);
28
+ this.emit('close', event);
29
+ // Code 1005 means the websocket was closed deliberately
30
+ if (event.code !== 1005) {
31
+ this.reconnect();
32
+ }
33
+ };
34
+
35
+ this.socket.onerror = error => {
36
+ console.error('WebSocket error', error);
37
+ this.emit('error', error);
38
+ // Optionally, you may want to trigger a reconnect on error as well
39
+ // this.reconnect();
40
+ };
41
+
42
+ this.socket.onmessage = message => {
43
+ // console.log('WebSocket message received', message.data);
44
+ this.emit('message', message.data);
45
+ };
46
+ }
47
+
48
+ send(data) {
49
+ if (this.socket && this.socket.readyState === WebSocket.OPEN) {
50
+ this.socket.send(data);
51
+ } else {
52
+ console.error('WebSocket is not open');
53
+ }
54
+ }
55
+
56
+ on(event, handler) {
57
+ // if (!this.eventHandlers[event]) {
58
+ this.eventHandlers[event] = [];
59
+ // }
60
+ this.eventHandlers[event].push(handler);
61
+ // console.log('Event handler added:', this.eventHandlers, event);
62
+ }
63
+
64
+ emit(event, ...args) {
65
+ if (this.eventHandlers[event]) {
66
+ this.eventHandlers[event].forEach(handler => handler(...args));
67
+ }
68
+ }
69
+
70
+ close() {
71
+ if (this.socket) {
72
+ this.socket.close();
73
+ }
74
+ clearTimeout(this.reconnectTimer);
75
+ }
76
+
77
+ reconnect() {
78
+ if (this.reconnectAttempts < this.maxReconnectAttempts) {
79
+ console.log(`Reconnecting attempt ${this.reconnectAttempts + 1}/${this.maxReconnectAttempts}...`);
80
+ this.reconnectTimer = setTimeout(() => {
81
+ this.reconnectAttempts++;
82
+ this.establishConnection();
83
+ }, this.reconnectInterval);
84
+ } else {
85
+ console.error('Max reconnect attempts reached. WebSocket will not attempt to reconnect.');
86
+ this.emit('max-reconnect-attempts');
87
+ }
88
+ }
89
+ }
90
+
91
+ export default WebSocketClient;
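Note on the reconnect behaviour of `WebSocketClient`: a close with code 1005 is treated as deliberate, any other close triggers up to `maxReconnectAttempts` retries, and a successful open resets the counter. The sketch below models only that decision logic in Python, without any network or timer. The class name `ReconnectPolicy` and the returned action strings are hypothetical and exist only for this illustration.

```python
class ReconnectPolicy:
    """Models the attempt counting done by WebSocketClient.reconnect()."""

    DELIBERATE_CLOSE_CODE = 1005  # treated in the diff as a deliberate close

    def __init__(self, max_attempts: int = 5):
        self.max_attempts = max_attempts
        self.attempts = 0

    def on_open(self) -> None:
        # A successful connection resets the counter, as in socket.onopen.
        self.attempts = 0

    def on_close(self, code: int) -> str:
        """Return the action the client would take after a close event."""
        if code == self.DELIBERATE_CLOSE_CODE:
            return "stay-closed"
        if self.attempts < self.max_attempts:
            # The real client increments inside a setTimeout callback;
            # the delay is omitted in this sketch.
            self.attempts += 1
            return "retry"
        return "give-up"


if __name__ == "__main__":
    policy = ReconnectPolicy(max_attempts=3)
    print([policy.on_close(1006) for _ in range(4)])
    policy.on_open()
    print(policy.on_close(1005))
```

The `give-up` branch corresponds to the point where the client emits `max-reconnect-attempts`.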
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/Chatbot.vue ADDED
@@ -0,0 +1,3 @@
1
+ <template>
2
+ <div>Chatbot</div>
3
+ </template>
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VideoCall.vue ADDED
@@ -0,0 +1,971 @@
1
+ <template>
2
+ <!-- <ExtraInfo webVersion="非websocket_0111" :modelVersion="modelVersion" /> -->
3
+ <div class="video-page">
4
+ <div class="video-page-header">
5
+ <div class="voice-container" v-if="!isCalling">
6
+ <SvgIcon name="voice" class="voice-icon" />
7
+ <SvgIcon name="voice" class="voice-icon" />
8
+ <SvgIcon name="voice" class="voice-icon" />
9
+ </div>
10
+ <div class="voice-container" v-else>
11
+ <Voice
12
+ :dataArray="dataArray"
13
+ :isCalling="isCalling"
14
+ :isPlaying="playing"
15
+ :configList="videoConfigList"
16
+ :boxStyle="{ height: '45px' }"
17
+ :itemStyle="{ width: '3px', margin: '0 1px' }"
18
+ />
19
+ </div>
20
+ <!-- <SelectTimbre v-model:timbre="timbre" v-model:audioData="audioData" v-model:disabled="isCalling" /> -->
21
+ </div>
22
+ <div class="video-page-content">
23
+ <div class="video-page-content-video" v-loading="loading" element-loading-background="#f3f3f3">
24
+ <video ref="videoRef" autoplay playsinline muted />
25
+ <canvas ref="canvasRef" canvas-id="canvasId" style="display: none" />
26
+ <div class="switch-camera" v-if="isMobile()" @click="switchCamera">
27
+ <SvgIcon name="switch-camera" class="icon" />
28
+ </div>
29
+ </div>
30
+ <div class="video-page-content-right">
31
+ <div class="output-content">
32
+ <ModelOutput
33
+ v-if="outputData.length > 0"
34
+ :outputData="outputData"
35
+ containerClass="output-content"
36
+ />
37
+ </div>
38
+ <div class="skip-box">
39
+ <!-- <DelayTips
40
+ v-if="delayTimestamp > 200 || delayCount > 2"
41
+ :delayTimestamp="delayTimestamp"
42
+ :delayCount="delayCount"
43
+ /> -->
44
+ <LikeAndDislike v-model:feedbackStatus="feedbackStatus" v-model:curResponseId="curResponseId" />
45
+ <SkipBtn :disabled="skipDisabled" @click="skipVoice" />
46
+ </div>
47
+ </div>
48
+ </div>
49
+ <div class="video-page-btn">
50
+ <el-button v-show="!isCalling" type="success" :disabled="callDisabled" @click="initRecording">
51
+ {{ callDisabled ? t('notReadyBtn') : t('videoCallBtn') }}
52
+ </el-button>
53
+ <el-button v-show="isCalling" @click="stopRecording" type="danger">
54
+ <SvgIcon name="phone-icon" className="phone-icon" />
55
+ <span class="btn-text">{{ t('hangUpBtn') }}</span>
56
+ <CountDown v-model="isCalling" @timeUp="stopRecording" />
57
+ </el-button>
58
+ </div>
59
+ <IdeasList v-if="showIdeasList" :ideasList="videoIdeasList" />
60
+ </div>
61
+ </template>
62
+ <script setup>
63
+ import { sendMessage, stopMessage, uploadConfig } from '@/apis';
64
+ import { encodeWAV } from '@/hooks/useVoice';
65
+ import { getNewUserId, setNewUserId } from '@/hooks/useRandomId';
66
+ import { fetchEventSource } from '@microsoft/fetch-event-source';
67
+ import { MicVAD } from '@ricky0123/vad-web';
68
+ import { videoIdeasList, videoConfigList, showIdeasList } from '@/enums';
69
+ import { isMobile, maxCount, getChunkLength } from '@/utils';
70
+ import { mergeBase64ToBlob } from './merge';
71
+ import { useI18n } from 'vue-i18n';
72
+
73
+ const { t } = useI18n();
74
+ import WebSocketService from '@/utils/websocket';
75
+
76
+ let ctrl = new AbortController();
77
+ let socket = null;
78
+ const audioData = ref({
79
+ base64Str: '',
80
+ type: 'mp3'
81
+ }); // Base64 of the custom voice timbre
82
+ const isCalling = defineModel();
83
+ const videoRef = ref();
84
+ const videoStream = ref(null);
85
+ const interval = ref();
86
+ const canvasRef = ref();
87
+ const videoImage = ref([]);
88
+ const videoLoaded = ref(false);
89
+ const taskQueue = ref([]);
90
+ const running = ref(false);
91
+ const outputData = ref([]);
92
+ const isFirstReturn = ref(true);
93
+ const audioPlayQueue = ref([]);
94
+ const base64List = ref([]);
95
+ const playing = ref(false);
96
+ const timbre = ref([1]);
97
+ const isReturnError = ref(false);
98
+
99
+ const textQueue = ref('');
100
+ const textAnimationInterval = ref();
101
+
102
+ const analyser = ref();
103
+ const dataArray = ref();
104
+ const animationFrameId = ref();
105
+ const skipDisabled = ref(true);
106
+ const stop = ref(false);
107
+ const isFrontCamera = ref(true);
108
+ const loading = ref(false);
109
+
110
+ const isEnd = ref(false); // SSE stream closed; the model is considered to have finished this response
111
+
112
+ const isFirstPiece = ref(true);
113
+ const allVoice = ref([]);
114
+ const callDisabled = ref(true);
115
+
116
+ const feedbackStatus = ref('');
117
+ const curResponseId = ref('');
118
+ const delayTimestamp = ref(0); // latency of the chunk currently being sent (ms)
119
+ const delayCount = ref(0); // backlog not yet sent to the API (set from the pending task count)
120
+
121
+ const modelVersion = ref('');
122
+
123
+ let mediaStream;
124
+ let audioRecorder;
125
+ let audioStream;
126
+ let intervalId;
127
+ let audioContext;
128
+ let audioChunks = [];
129
+ let count = 0;
130
+ let audioDOM;
131
+
132
+ onBeforeUnmount(() => {
133
+ stopRecording();
134
+ });
135
+ const vadStartTime = ref();
136
+ let myvad = null;
137
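+ let vadTimer = null; // VAD timer: if speech stops before the timeout (500 ms below), treat it as a false VAD trigger and ignore it; otherwise treat it as real speech and skip the current response automatically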
138
+ const vadStart = async () => {
139
+ myvad = await MicVAD.new({
140
+ onSpeechStart: () => {
141
+ console.log('Speech start', +new Date());
142
+ // if (!skipDisabled.value) {
143
+ vadTimer && clearTimeout(vadTimer);
144
+ vadTimer = setTimeout(() => {
145
+ // vadStartTime.value = +new Date();
146
+ console.log('Interrupt time: ', +new Date());
147
+ skipVoice();
148
+ }, 500);
149
+ // }
150
+ },
151
+ onSpeechEnd: audio => {
152
+ vadTimer && clearTimeout(vadTimer);
153
+ console.log('Speech end', +new Date());
154
+ // debugger;
155
+ // do something with `audio` (Float32Array of audio samples at sample rate 16000)...
156
+ },
157
+ baseAssetPath: '/'
158
+ });
159
+ myvad.start();
160
+ };
161
+ onMounted(async () => {
162
+ const { code, message } = await stopMessage();
163
+ if (code !== 0) {
164
+ ElMessage({
165
+ type: 'error',
166
+ message: message,
167
+ duration: 3000,
168
+ customClass: 'system-error'
169
+ });
170
+ return;
171
+ }
172
+ callDisabled.value = false;
173
+ });
174
+ const delay = ms => {
175
+ return new Promise(resolve => setTimeout(resolve, ms));
176
+ };
177
+ const initRecording = async () => {
178
+ uploadUserConfig()
179
+ .then(async () => {
180
+ if (!audioDOM) {
181
+ audioDOM = new Audio();
182
+ audioDOM.playsinline = true;
183
+ audioDOM.preload = 'auto';
184
+ }
185
+ // Each call needs a freshly generated uid
186
+ setNewUserId();
187
+ buildConnect();
188
+ await delay(100);
189
+ // if (socket) {
190
+ // socket.close();
191
+ // }
192
+ // socket = new WebSocketService(
193
+ // `/ws/stream${window.location.search}&uid=${getNewUserId()}&service=minicpmo-server`
194
+ // );
195
+ // socket.connect();
196
+
197
+ initVideoStream('environment');
198
+ if (localStorage.getItem('canStopByVoice') === 'true') {
199
+ console.log('vad start');
200
+ vadStart();
201
+ }
202
+ })
203
+ .catch(() => {});
204
+ };
205
+ // Switch camera
206
+ const switchCamera = () => {
207
+ if (!isCalling.value) {
208
+ return;
209
+ }
210
+ isFrontCamera.value = !isFrontCamera.value;
211
+ const facingMode = isFrontCamera.value ? 'environment' : 'user'; // 'user' = front camera, 'environment' = rear camera
212
+ initVideoStream(facingMode);
213
+ };
214
+ const initVideoStream = async facingMode => {
215
+ if (mediaStream) {
216
+ mediaStream.getTracks().forEach(track => track.stop());
217
+ videoStream.value = null;
218
+ }
219
+ outputData.value = [];
220
+ isCalling.value = true;
221
+ loading.value = true;
222
+ if (!videoStream.value) {
223
+ try {
224
+ mediaStream = await window.navigator.mediaDevices.getUserMedia({
225
+ video: { facingMode },
226
+ audio: true
227
+ });
228
+ videoStream.value = mediaStream;
229
+ videoRef.value.srcObject = mediaStream;
230
+ loading.value = false;
231
+ console.log('Stream opened at: ', +new Date());
232
+ // takePhotos();
233
+ audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 16000 });
234
+ console.log('sampleRate: ', audioContext.sampleRate);
235
+ const audioSource = audioContext.createMediaStreamSource(mediaStream);
236
+ interval.value = setInterval(() => dealImage(), 50);
237
+ // Create a ScriptProcessorNode to capture audio data
238
+ const processor = audioContext.createScriptProcessor(256, 1, 1);
239
+
240
+ processor.onaudioprocess = event => {
241
+ if (!isCalling.value) return;
242
+ if (isReturnError.value) {
243
+ stopRecording();
244
+ return;
245
+ }
246
+ const data = event.inputBuffer.getChannelData(0);
247
+ audioChunks.push(new Float32Array(data));
248
+ // Check whether one second of audio has been collected
249
+ const totalBufferLength = audioChunks.reduce((total, curr) => total + curr.length, 0);
250
+ const chunkLength = getChunkLength(audioContext.sampleRate);
251
+ if (totalBufferLength >= chunkLength) {
252
+ // Merge into a single buffer and trim it to one second
253
+ const mergedBuffer = mergeBuffers(audioChunks, totalBufferLength);
254
+ const oneSecondBuffer = mergedBuffer.slice(0, audioContext.sampleRate);
255
+
256
+ // Queue the chunk to be encoded as WAV and sent
257
+ addQueue(+new Date(), () => saveAudioChunk(oneSecondBuffer, +new Date()));
258
+
259
+ // Keep the leftover samples for the next chunk
260
+ audioChunks = [mergedBuffer.slice(audioContext.sampleRate)];
261
+ }
262
+ };
263
+ analyser.value = audioContext.createAnalyser();
264
+ // Connect the audio source to the analyser
265
+ audioSource.connect(analyser.value);
266
+ // Analyser settings
267
+ analyser.value.fftSize = 256;
268
+ const bufferLength = analyser.value.frequencyBinCount;
269
+ dataArray.value = new Uint8Array(bufferLength);
270
+ // Start drawing the waveform bars
271
+ drawBars();
272
+
273
+ audioSource.connect(processor);
274
+ processor.connect(audioContext.destination);
275
+ } catch {}
276
+ }
277
+ };
278
+ const drawText = async () => {
279
+ if (textQueue.value.length > 0) {
280
+ outputData.value[outputData.value.length - 1].text += textQueue.value[0];
281
+ textQueue.value = textQueue.value.slice(1);
282
+ } else {
283
+ cancelAnimationFrame(textAnimationInterval.value);
284
+ }
285
+ textAnimationInterval.value = requestAnimationFrame(drawText);
286
+ };
287
+ const getStopValue = () => {
288
+ return stop.value;
289
+ };
290
+ const getPlayingValue = () => {
291
+ return playing.value;
292
+ };
293
+ const getStopStatus = () => {
294
+ return localStorage.getItem('canStopByVoice') === 'true';
295
+ };
296
+ const saveAudioChunk = (buffer, timestamp) => {
297
+ return new Promise(resolve => {
298
+ if (!getStopStatus() && getPlayingValue()) {
299
+ resolve();
300
+ return;
301
+ }
302
+ const wavBlob = encodeWAV(buffer, audioContext.sampleRate);
303
+ let reader = new FileReader();
304
+ reader.readAsDataURL(wavBlob);
305
+
306
+ reader.onloadend = async function () {
307
+ let base64data = reader.result.split(',')[1];
308
+ const imgBase64 = videoImage.value[videoImage.value.length - 1]?.src;
309
+ if (!(base64data && imgBase64)) {
310
+ resolve();
311
+ return;
312
+ }
313
+ const strBase64 = imgBase64.split(',')[1];
314
+ count++;
315
+ let obj = {
316
+ messages: [
317
+ {
318
+ role: 'user',
319
+ content: [
320
+ {
321
+ type: 'input_audio',
322
+ input_audio: {
323
+ data: base64data,
324
+ format: 'wav',
325
+ timestamp: String(timestamp)
326
+ }
327
+ }
328
+ ]
329
+ }
330
+ ]
331
+ };
332
+ obj.messages[0].content.unshift({
333
+ type: 'image_data',
334
+ image_data: {
335
+ data: count === maxCount ? strBase64 : '',
336
+ type: 2
337
+ }
338
+ });
339
+ if (count === maxCount) {
340
+ count = 0;
341
+ }
342
+ // socket.send(JSON.stringify(obj));
343
+ // socket.on('message', data => {
344
+ // console.log('message: ', data);
345
+ // delayTimestamp.value = +new Date() - timestamp;
346
+ // delayCount.value = taskQueue.value.length;
347
+ // resolve();
348
+ // });
349
+ // Send the Base64 audio data to the backend
350
+ try {
351
+ await sendMessage(obj);
352
+ delayTimestamp.value = +new Date() - timestamp;
353
+ delayCount.value = taskQueue.value.length;
354
+ } catch (err) {}
355
+ resolve();
356
+ };
357
+ });
358
+ };
359
+ const mergeBuffers = (buffers, length) => {
360
+ const result = new Float32Array(length);
361
+ let offset = 0;
362
+ for (let buffer of buffers) {
363
+ result.set(buffer, offset);
364
+ offset += buffer.length;
365
+ }
366
+ return result;
367
+ };
368
+ const stopRecording = () => {
369
+ isCalling.value = false;
370
+ clearInterval(interval.value);
371
+ interval.value = null;
372
+ if (audioRecorder && audioRecorder.state !== 'inactive') {
373
+ audioRecorder.stop();
374
+ }
375
+ if (animationFrameId.value) {
376
+ cancelAnimationFrame(animationFrameId.value);
377
+ }
378
+ if (audioContext && audioContext.state !== 'closed') {
379
+ audioContext.close();
380
+ }
381
+ destroyVideoStream();
382
+ taskQueue.value = [];
383
+ audioPlayQueue.value = [];
384
+ base64List.value = [];
385
+ ctrl.abort();
386
+ ctrl = new AbortController();
387
+ isReturnError.value = false;
388
+ skipDisabled.value = true;
389
+ playing.value = false;
390
+ audioDOM?.pause();
391
+ stopMessage();
392
+ if (socket) {
393
+ socket.close();
394
+ }
395
+ if (
396
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
397
+ outputData.value[outputData.value.length - 1].audio === '' &&
398
+ allVoice.value.length > 0
399
+ ) {
400
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
401
+ }
402
+ myvad && myvad.destroy();
403
+ };
404
+ // Establish the connection
405
+ const buildConnect = () => {
406
+ const obj = {
407
+ messages: [
408
+ {
409
+ role: 'user',
410
+ content: [{ type: 'none' }]
411
+ }
412
+ ],
413
+ stream: true
414
+ };
415
+ isEnd.value = false;
416
+ ctrl.abort();
417
+ ctrl = new AbortController();
418
+ const url = `/api/v1/completions${window.location.search}`;
419
+
420
+ fetchEventSource(url, {
421
+ method: 'POST',
422
+ headers: {
423
+ 'Content-Type': 'application/json',
424
+ service: 'minicpmo-server',
425
+ uid: getNewUserId()
426
+ },
427
+ body: JSON.stringify(obj),
428
+ signal: ctrl.signal,
429
+ openWhenHidden: true,
430
+ async onopen(response) {
431
+ isFirstPiece.value = true;
432
+ isFirstReturn.value = true;
433
+ allVoice.value = [];
434
+ base64List.value = [];
435
+ console.log('onopen', response);
436
+ if (response.status !== 200) {
437
+ ElMessage({
438
+ type: 'error',
439
+ message: 'At limit. Please try again soon.',
440
+ duration: 3000,
441
+ customClass: 'system-error'
442
+ });
443
+ isReturnError.value = true;
444
+ } else {
445
+ isReturnError.value = false;
446
+ drawText();
447
+ }
448
+ },
449
+ onmessage(msg) {
450
+ const data = JSON.parse(msg.data);
451
+ if (data.response_id) {
452
+ curResponseId.value = data.response_id;
453
+ }
454
+ if (data.choices[0]?.text) {
455
+ textQueue.value += data.choices[0].text.replace('<end>', '');
456
+ console.warn('text return time -------------------------------', +new Date());
457
+ }
458
+ // The first message echoes the audio clip the frontend sent to the backend, so handle it separately
459
+ if (isFirstReturn.value) {
460
+ console.log('First return');
461
+ isFirstReturn.value = false;
462
+ // If the backend returned empty audio, reconnect
463
+ if (!data.choices[0].audio) {
464
+ buildConnect();
465
+ return;
466
+ }
467
+ outputData.value.push({
468
+ type: 'USER',
469
+ audio: `data:audio/wav;base64,${data.choices[0].audio}`
470
+ });
471
+ outputData.value.push({
472
+ type: 'BOT',
473
+ text: '',
474
+ audio: ''
475
+ });
476
+ return;
477
+ }
478
+ if (data.choices[0]?.audio) {
479
+ console.log('audio return time -------------------------------', +new Date());
480
+ if (!getStopValue() && isCalling.value) {
481
+ skipDisabled.value = false;
482
+ base64List.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
483
+ addAudioQueue(() => truePlay(data.choices[0].audio));
484
+ }
485
+ allVoice.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
486
+ } else {
487
+ // Something went wrong; reconnect immediately
488
+ buildConnect();
489
+ }
490
+ if (data.choices[0]?.text?.includes('<end>')) {
491
+ // isEnd.value = true;
492
+ console.log('End marker received:', +new Date());
493
+ if (
494
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
495
+ outputData.value[outputData.value.length - 1].audio === '' &&
496
+ allVoice.value.length > 0
497
+ ) {
498
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
499
+ }
500
+ }
501
+ },
502
+ onclose() {
503
+ console.log('onclose', +new Date());
504
+ isEnd.value = true;
505
+ if (
506
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
507
+ outputData.value[outputData.value.length - 1].audio === '' &&
508
+ allVoice.value.length > 0
509
+ ) {
510
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
511
+ }
512
+ // After the SSE stream closes, an empty playback queue means the model failed and this connection returned no audio, so reconnect immediately
513
+ vadStartTime.value = +new Date();
514
+ if (audioPlayQueue.value.length === 0) {
515
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
516
+ if (startIndex !== -1) {
517
+ taskQueue.value = taskQueue.value.slice(startIndex);
518
+ }
519
+ buildConnect();
520
+ }
521
+ },
522
+ onerror(err) {
523
+ console.log('onerror', err);
524
+ ctrl.abort();
525
+ ctrl = new AbortController();
526
+ throw err;
527
+ }
528
+ });
529
+ };
530
+ // Queue the returned audio and play it piece by piece
531
+ const addAudioQueue = async item => {
532
+ audioPlayQueue.value.push(item);
533
+ if (isFirstPiece.value) {
534
+ await delay(1500);
535
+ isFirstPiece.value = false;
536
+ }
537
+ if (audioPlayQueue.value.length > 0 && !playing.value) {
538
+ playing.value = true;
539
+ playAudio();
540
+ }
541
+ };
542
+ // Drive the playback queue
543
+ const playAudio = () => {
544
+ console.log('Remaining playback queue:', audioPlayQueue.value, +new Date());
545
+
546
+ if (!isEnd.value && base64List.value.length >= 2) {
547
+ const remainLen = base64List.value.length;
548
+ const blob = mergeBase64ToBlob(base64List.value);
549
+ audioDOM.src = blob;
550
+ audioDOM.play();
551
+ console.error('Early merged playback start time: ', +new Date());
552
+ audioDOM.onended = () => {
553
+ console.error('Early merged playback end time: ', +new Date());
554
+ base64List.value = base64List.value.slice(remainLen);
555
+ audioPlayQueue.value = audioPlayQueue.value.slice(remainLen);
556
+ playAudio();
557
+ };
558
+ return;
559
+ }
560
+ if (isEnd.value && base64List.value.length >= 2) {
561
+ const blob = mergeBase64ToBlob(base64List.value);
562
+ audioDOM.src = blob;
563
+ audioDOM.play();
564
+ console.error('Merged playback start time: ', +new Date());
565
+ audioDOM.onended = () => {
566
+ console.error('Merged playback end time: ', +new Date());
567
+ // URL.revokeObjectURL(url);
568
+ base64List.value = [];
569
+ audioPlayQueue.value = [];
570
+ playing.value = false;
571
+ skipDisabled.value = true;
572
+ if (isCalling.value && !isReturnError.value) {
573
+ // skipDisabled.value = true;
574
+ taskQueue.value = [];
575
+ // Record the interrupt or VAD trigger time before interrupting
576
+ // vadStartTime.value = +new Date();
577
+ // // After each completion, keep only the audio from the last 1 s
578
+ // console.log(
579
+ // 'Queue before trimming:',
580
+ // taskQueue.value.map(item => item.time)
581
+ // );
582
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
583
+ // if (startIndex !== -1) {
584
+ // taskQueue.value = taskQueue.value.slice(startIndex);
585
+ // console.log(
586
+ // 'Queue after trimming:',
587
+ // taskQueue.value.map(item => item.time),
588
+ // vadStartTime.value
589
+ // );
590
+ // }
591
+ buildConnect();
592
+ }
593
+ };
594
+ return;
595
+ }
596
+ base64List.value.shift();
597
+ const _truePlay = audioPlayQueue.value.shift();
598
+ if (_truePlay) {
599
+ _truePlay().finally(() => {
600
+ playAudio();
601
+ });
602
+ } else {
603
+ playing.value = false;
604
+ if (isEnd.value) {
605
+ console.warn('play done................');
606
+ skipDisabled.value = true;
607
+ }
608
+ // Once playback finishes, start the next connection if the call is still active and the API returned no error
609
+ if (isEnd.value && isCalling.value && !isReturnError.value) {
610
+ // skipDisabled.value = true;
611
+ taskQueue.value = [];
612
+ // After skipping, keep only the audio chunks from the last second onward
613
+ // vadStartTime.value = +new Date();
614
+ // console.log(
615
+ // 'Queue before trimming:',
616
+ // taskQueue.value.map(item => item.time)
617
+ // );
618
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
619
+ // if (startIndex !== -1) {
620
+ // taskQueue.value = taskQueue.value.slice(startIndex);
621
+ // console.log(
622
+ // 'Queue after trimming:',
623
+ // taskQueue.value.map(item => item.time),
624
+ // vadStartTime.value
625
+ // );
626
+ // }
627
+ buildConnect();
628
+ }
629
+ }
630
+ };
631
+ // Play audio
632
+ const truePlay = voice => {
633
+ console.log('promise: ', +new Date());
634
+ return new Promise(resolve => {
635
+ audioDOM.src = 'data:audio/wav;base64,' + voice;
636
+ console.error('Playback start time:', +new Date());
637
+ audioDOM
638
+ .play()
639
+ .then(() => {
640
+ console.log('Audio played successfully');
641
+ })
642
+ .catch(error => {
643
+ if (error.name === 'NotAllowedError' || error.name === 'SecurityError') {
644
+ console.error('User interaction required or permission issue:', error);
645
+ // ElMessage.warning('Audio playback failed');
646
+ console.error('Playback failure time');
647
+ // alert('Please interact with the page (like clicking a button) to enable audio playback.');
648
+ } else {
649
+ console.error('Error playing audio:', error);
650
+ }
651
+ });
652
+ // .finally(() => {
653
+ // resolve();
654
+ // });
655
+ audioDOM.onerror = () => {
656
+ console.error('Playback failure time', +new Date());
657
+ resolve();
658
+ };
659
+ audioDOM.onended = () => {
660
+ console.error('Playback end time: ', +new Date());
661
+ // URL.revokeObjectURL(url);
662
+ resolve();
663
+ };
664
+ });
665
+ };
666
+ // When the queue holds at least one task, start processing it
667
+ const addQueue = (time, item) => {
668
+ taskQueue.value.push({ func: item, time });
669
+ if (taskQueue.value.length > 0 && !running.value) {
670
+ running.value = true;
671
+ processQueue();
672
+ }
673
+ };
674
+ const processQueue = () => {
675
+ const item = taskQueue.value.shift();
676
+ if (item?.func) {
677
+ item.func()
678
+ .then(res => {
679
+ console.log('Task processed: ', res);
680
+ })
681
+ .finally(() => processQueue());
682
+ } else {
683
+ running.value = false;
684
+ }
685
+ };
686
+ const destroyVideoStream = () => {
687
+ videoStream.value?.getTracks().forEach(track => track.stop());
688
+ videoStream.value = null;
689
+ // Set srcObject to null to detach the MediaStream so that it can be released
690
+ videoRef.value.srcObject = null;
691
+
692
+ videoImage.value = [];
693
+ videoLoaded.value = false;
694
+
695
+ clearInterval(intervalId);
696
+ clearInterval(interval.value);
697
+ interval.value = null;
698
+ };
699
+ const dealImage = () => {
700
+ if (!videoRef.value) {
701
+ return;
702
+ }
703
+ const canvas = canvasRef.value;
704
+ canvasRef.value.width = videoRef.value.videoWidth;
705
+ canvasRef.value.height = videoRef.value.videoHeight;
706
+ const context = canvas.getContext('2d');
707
+ context.drawImage(videoRef.value, 0, 0, canvasRef.value.width, canvasRef.value.height);
708
+ const imageDataUrl = canvas.toDataURL('image/webp', 0.8);
709
+
710
+ videoImage.value.push({ src: imageDataUrl });
711
+ };
712
+ const drawBars = () => {
713
+ // AnalyserNode.getByteFrequencyData() copies the current frequency data into the supplied Uint8Array (unsigned byte array).
714
+ analyser.value.getByteFrequencyData(dataArray.value);
715
+ animationFrameId.value = requestAnimationFrame(drawBars);
716
+ };
717
+ // Skip the current segment
718
+ const skipVoice = async () => {
719
+ // Record the interrupt or VAD trigger time before interrupting
720
+ vadStartTime.value = +new Date();
721
+ if (!skipDisabled.value) {
722
+ if (
723
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
724
+ outputData.value[outputData.value.length - 1].audio === ''
725
+ ) {
726
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
727
+ }
728
+ base64List.value = [];
729
+ audioPlayQueue.value = [];
730
+ // After skipping, keep only the audio chunks from the last second onward
731
+ console.log(
732
+ 'Queue before trimming:',
733
+ taskQueue.value.map(item => item.time)
734
+ );
735
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
736
+ if (startIndex !== -1) {
737
+ taskQueue.value = taskQueue.value.slice(startIndex);
738
+ console.log(
739
+ 'Queue after trimming:',
740
+ taskQueue.value.map(item => item.time),
741
+ vadStartTime.value
742
+ );
743
+ }
744
+ stop.value = true;
745
+ audioDOM?.pause();
746
+ setTimeout(() => {
747
+ skipDisabled.value = true;
748
+ }, 300);
749
+ try {
750
+ playing.value = false;
751
+ await stopMessage();
752
+ stop.value = false;
753
+ // playing.value = false;
754
+ buildConnect();
755
+ // cancelAnimationFrame(animationFrameId.value);
756
+ } catch (err) {}
757
+ }
758
+ };
759
+ // Upload the current user config before every call
760
+ const uploadUserConfig = async () => {
761
+ if (!localStorage.getItem('configData')) {
762
+ return new Promise(resolve => resolve());
763
+ }
764
+ const {
765
+ videoQuality,
766
+ useAudioPrompt,
767
+ voiceClonePrompt,
768
+ assistantPrompt,
769
+ vadThreshold,
770
+ audioFormat,
771
+ base64Str
772
+ } = JSON.parse(localStorage.getItem('configData'));
773
+ const obj = {
774
+ messages: [
775
+ {
776
+ role: 'user',
777
+ content: [
778
+ {
779
+ type: 'input_audio',
780
+ input_audio: {
781
+ data: base64Str,
782
+ format: audioFormat
783
+ }
784
+ },
785
+ {
786
+ type: 'options',
787
+ options: {
788
+ hd_video: videoQuality,
789
+ use_audio_prompt: useAudioPrompt,
790
+ vad_threshold: vadThreshold,
791
+ voice_clone_prompt: voiceClonePrompt,
792
+ assistant_prompt: assistantPrompt
793
+ }
794
+ }
795
+ ]
796
+ }
797
+ ]
798
+ };
799
+ const { code, message, data } = await uploadConfig(obj);
800
+ modelVersion.value = data?.choices?.content || '';
801
+ return new Promise((resolve, reject) => {
802
+ if (code !== 0) {
803
+ ElMessage({
804
+ type: 'error',
805
+ message: message,
806
+ duration: 3000,
807
+ customClass: 'system-error'
808
+ });
809
+ reject();
810
+ } else {
811
+ resolve();
812
+ }
813
+ });
814
+ };
815
+ </script>
816
+ <style lang="less" scoped>
817
+ .video-page {
818
+ flex: 1;
819
+ height: 100%;
820
+ display: flex;
821
+ flex-direction: column;
822
+ &-header {
823
+ width: 100%;
824
+ display: flex;
825
+ align-items: center;
826
+ justify-content: center;
827
+ padding: 0 16px 16px;
828
+ box-shadow: 0 0.5px 0 0 #e0e0e0;
829
+ margin-bottom: 16px;
830
+ .header-icon {
831
+ display: flex;
832
+ align-items: center;
833
+ img {
834
+ width: 24px;
835
+ height: 24px;
836
+ margin-right: 8px;
837
+ }
838
+ span {
839
+ color: rgba(23, 23, 23, 0.9);
840
+ font-family: PingFang SC;
841
+ font-size: 16px;
842
+ font-style: normal;
843
+ font-weight: 500;
844
+ line-height: normal;
845
+ margin-right: 40px;
846
+ flex-shrink: 0;
847
+ }
848
+ }
849
+ .voice-container {
850
+ display: flex;
851
+ .voice-icon {
852
+ width: 191px;
853
+ height: 45px;
854
+ }
855
+ }
856
+ }
857
+ &-content {
858
+ flex: 1;
859
+ margin-bottom: 16px;
860
+ display: flex;
861
+ height: 0;
862
+ &-video {
863
+ width: 50%;
864
+ height: 100%;
865
+ background: #f3f3f3;
866
+ flex-shrink: 0;
867
+ position: relative;
868
+ video {
869
+ width: 100%;
870
+ height: 100%;
871
+ object-fit: contain;
872
+ }
873
+ .switch-camera {
874
+ position: absolute;
875
+ top: 10px;
876
+ right: 10px;
877
+ width: 36px;
878
+ height: 36px;
879
+ background: #ffffff;
880
+ border-radius: 6px;
881
+ display: flex;
882
+ justify-content: center;
883
+ align-items: center;
884
+ font-size: 24px;
885
+ z-index: 999;
886
+ .icon {
887
+ width: 20px;
888
+ height: 20px;
889
+ }
890
+ }
891
+ }
892
+ &-right {
893
+ margin-left: 16px;
894
+ flex: 1;
895
+ padding: 0 16px;
896
+ display: flex;
897
+ flex-direction: column;
898
+ .output-content {
899
+ flex: 1;
900
+ overflow: auto;
901
+ }
902
+ .skip-box {
903
+ display: flex;
904
+ align-items: center;
905
+ justify-content: flex-end;
906
+ margin-top: 16px;
907
+ }
908
+ }
909
+ }
910
+ &-btn {
911
+ text-align: center;
912
+ padding: 8px 0;
913
+ .el-button {
914
+ width: 284px;
915
+ height: 46px;
916
+ border-radius: 8px;
917
+ }
918
+ .el-button.el-button--success {
919
+ background: #647fff;
920
+ border-color: #647fff;
921
+ &:hover {
922
+ opacity: 0.8;
923
+ }
924
+ span {
925
+ color: #fff;
926
+ font-family: PingFang SC;
927
+ font-size: 16px;
928
+ font-style: normal;
929
+ font-weight: 500;
930
+ line-height: normal;
931
+ }
932
+ }
933
+ .el-button.el-button--success.is-disabled {
934
+ background: #f3f3f3;
935
+ border-color: #f3f3f3;
936
+ span {
937
+ color: #d1d1d1;
938
+ }
939
+ }
940
+ .el-button.el-button--danger {
941
+ border-color: #dc3545;
942
+ background-color: #dc3545;
943
+ color: #ffffff;
944
+ font-family: PingFang SC;
945
+ font-size: 16px;
946
+ font-style: normal;
947
+ font-weight: 500;
948
+ line-height: normal;
949
+ .phone-icon {
950
+ margin-right: 10px;
951
+ }
952
+ .btn-text {
953
+ margin-right: 10px;
954
+ }
955
+ .btn-desc {
956
+ margin-right: 16px;
957
+ }
958
+ }
959
+ }
960
+ }
961
+ .video-size {
962
+ position: absolute;
963
+ bottom: 10px;
964
+ right: 10px;
965
+ background: rgba(0, 0, 0, 0.5);
966
+ color: #fff;
967
+ padding: 4px 8px;
968
+ border-radius: 4px;
969
+ font-size: 12px;
970
+ }
971
+ </style>
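Note on the audio chunking in `VideoCall.vue` above: the `onaudioprocess` handler accumulates `Float32Array` frames, merges them once enough samples are buffered, sends a fixed-size slice, and keeps the remainder. The sketch below isolates that pattern so it can be exercised without a browser. It is only an illustration under stated assumptions: `createChunker` and its parameters are hypothetical names that do not appear in the commit, and it uses a `while` loop (the commit uses a single `if`) so that one large frame can yield several chunks.

```javascript
// Hypothetical helper, not part of the commit: buffers incoming frames and
// emits fixed-size chunks, mirroring the merge-and-slice logic in the diff.
function createChunker(chunkLength, onChunk) {
  let pending = [];      // frames not yet emitted
  let pendingLength = 0; // total number of samples held in `pending`

  return function push(frame) {
    // Copy the frame, since a caller may reuse its input buffer.
    pending.push(new Float32Array(frame));
    pendingLength += frame.length;

    while (pendingLength >= chunkLength) {
      // Merge all pending frames into one contiguous buffer.
      const merged = new Float32Array(pendingLength);
      let offset = 0;
      for (const part of pending) {
        merged.set(part, offset);
        offset += part.length;
      }
      // Emit exactly one chunk and keep the leftover samples.
      onChunk(merged.slice(0, chunkLength));
      const rest = merged.slice(chunkLength);
      pending = rest.length > 0 ? [rest] : [];
      pendingLength = rest.length;
    }
  };
}

// Usage with a tiny chunk size so the behaviour is easy to follow.
const chunks = [];
const push = createChunker(4, chunk => chunks.push(Array.from(chunk)));
push([1, 2, 3]);
push([4, 5, 6, 7, 8, 9]);
console.log(chunks);
```

In the component, the chunk length would come from `getChunkLength(audioContext.sampleRate)`. Using the same value for both the threshold and the slice keeps the emitted chunk and the retained remainder consistent with each other.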
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VideoCall_0105.vue ADDED
@@ -0,0 +1,955 @@
1
+ <template>
2
+ <ExtraInfo webVersion="websocket_0107" :modelVersion="modelVersion" />
3
+ <div class="video-page">
4
+ <div class="video-page-header">
5
+ <div style="display: flex; align-items: center" class="header-icon">
6
+ <img src="@/assets/images/voice-icon.png" />
7
+ <span>Audio Choice</span>
8
+ </div>
9
+ <div class="voice-container" v-if="!isCalling">
10
+ <SvgIcon name="voice" class="voice-icon" />
11
+ <SvgIcon name="voice" class="voice-icon" />
12
+ <SvgIcon name="voice" class="voice-icon" />
13
+ </div>
14
+ <div class="voice-container" v-else>
15
+ <Voice
16
+ :dataArray="dataArray"
17
+ :isCalling="isCalling"
18
+ :isPlaying="playing"
19
+ :configList="videoConfigList"
20
+ :boxStyle="{ height: '45px' }"
21
+ :itemStyle="{ width: '3px', margin: '0 1px' }"
22
+ />
23
+ </div>
24
+ <!-- <SelectTimbre v-model:timbre="timbre" v-model:audioData="audioData" v-model:disabled="isCalling" /> -->
25
+ </div>
26
+ <div class="video-page-content">
27
+ <div class="video-page-content-video" v-loading="loading" element-loading-background="#f3f3f3">
28
+ <video ref="videoRef" autoplay playsinline muted />
29
+ <canvas ref="canvasRef" canvas-id="canvasId" style="display: none" />
30
+ <div class="switch-camera" v-if="isMobile()" @click="switchCamera">
31
+ <SvgIcon name="switch-camera" class="icon" />
32
+ </div>
33
+ <!-- <div class="video-size" v-if="width || height">{{ width }} x {{ height }}</div> -->
34
+ </div>
35
+ <div class="video-page-content-right">
36
+ <div class="output-content">
37
+ <ModelOutput
38
+ v-if="outputData.length > 0"
39
+ :outputData="outputData"
40
+ containerClass="output-content"
41
+ />
42
+ </div>
43
+ <div class="skip-box">
44
+ <DelayTips
45
+ v-if="delayTimestamp > 200 || delayCount > 2"
46
+ :delayTimestamp="delayTimestamp"
47
+ :delayCount="delayCount"
48
+ />
49
+ <LikeAndDislike v-model:feedbackStatus="feedbackStatus" v-model:curResponseId="curResponseId" />
50
+ <SkipBtn :disabled="skipDisabled" @click="skipVoice" />
51
+ </div>
52
+ </div>
53
+ </div>
54
+ <div class="video-page-btn">
55
+ <el-button v-show="!isCalling" type="success" :disabled="callDisabled" @click="initRecording">
56
+ {{ callDisabled ? 'Not ready yet, please wait' : 'Call MiniCPM' }}
57
+ </el-button>
58
+ <el-button v-show="isCalling" @click="stopRecording" type="danger">
59
+ <SvgIcon name="phone-icon" className="phone-icon" />
60
+ <span class="btn-text">Hang Up</span>
61
+ <CountDown v-model="isCalling" @timeUp="stopRecording" />
62
+ </el-button>
63
+ </div>
64
+ <IdeasList v-if="showIdeasList" :ideasList="videoIdeasList" />
65
+ </div>
66
+ </template>
67
+ <script setup>
68
+ import { sendMessage, stopMessage, uploadConfig } from '@/apis';
69
+ import { encodeWAV } from '@/hooks/useVoice';
70
+ import { getNewUserId, setNewUserId } from '@/hooks/useRandomId';
71
+ import { fetchEventSource } from '@microsoft/fetch-event-source';
72
+ import { MicVAD } from '@ricky0123/vad-web';
73
+ import { videoIdeasList, videoConfigList, showIdeasList } from '@/enums';
74
+ import { isMobile, maxCount, getChunkLength } from '@/utils';
75
+ import { mergeBase64ToBlob } from './merge';
76
+ import WebSocketService from '@/utils/websocket';
77
+ let ctrl = new AbortController();
78
+ let socket = null;
79
+ const audioData = ref({
80
+ base64Str: '',
81
+ type: 'mp3'
82
+ }); // Base64 of the custom voice timbre
83
+ const isCalling = defineModel();
84
+ const videoRef = ref();
85
+ const videoStream = ref(null);
86
+ const interval = ref();
87
+ const canvasRef = ref();
88
+ const videoImage = ref([]);
89
+ const videoLoaded = ref(false);
90
+ const taskQueue = ref([]);
91
+ const running = ref(false);
92
+ const outputData = ref([]);
93
+ const isFirstReturn = ref(true);
94
+ const audioPlayQueue = ref([]);
95
+ const base64List = ref([]);
96
+ const playing = ref(false);
97
+ const timbre = ref([1]);
98
+ const isReturnError = ref(false);
99
+
100
+ const textQueue = ref('');
101
+ const textAnimationInterval = ref();
102
+ const analyser = ref();
103
+ const dataArray = ref();
104
+ const animationFrameId = ref();
105
+ const skipDisabled = ref(true);
106
+ const stop = ref(false);
107
+ const isFrontCamera = ref(true);
108
+ const loading = ref(false);
109
+ const isEnd = ref(false); // SSE stream closed; the model is considered to have finished this response
110
+ const isFirstPiece = ref(true);
111
+ const allVoice = ref([]);
112
+ const callDisabled = ref(true);
113
+ const feedbackStatus = ref('');
114
+ const curResponseId = ref('');
115
+ const delayTimestamp = ref(0); // latency of the chunk currently being sent
116
+ const delayCount = ref(0); // number of queued chunks not yet sent to the API
117
+
118
+ const modelVersion = ref('');
119
+
120
+ let mediaStream;
121
+ let audioRecorder;
122
+ let audioStream;
123
+ let audioContext;
124
+ let audioChunks = [];
125
+ let count = 0;
126
+ let audioDOM;
127
+ onBeforeUnmount(() => {
128
+ stopRecording();
129
+ });
130
+ const vadStartTime = ref();
131
+ let myvad = null;
132
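+ let vadTimer = null; // VAD timer: checks whether speech stops within 1s. If it stops within 1s, treat it as a false VAD trigger and ignore it; if it does not, treat it as real speech and skip the current response automatically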
133
+ const vadStart = async () => {
134
+ myvad = await MicVAD.new({
135
+ onSpeechStart: () => {
136
+ console.log('Speech start', +new Date());
137
+ if (!skipDisabled.value) {
138
+ vadTimer && clearTimeout(vadTimer);
139
+ vadTimer = setTimeout(() => {
140
+ // vadStartTime.value = +new Date();
141
+ console.log('Interrupt time: ', +new Date());
142
+ skipVoice();
143
+ }, 1000);
144
+ }
145
+ },
146
+ onSpeechEnd: audio => {
147
+ vadTimer && clearTimeout(vadTimer);
148
+ console.log('Speech end', +new Date());
149
+ // debugger;
150
+ // do something with `audio` (Float32Array of audio samples at sample rate 16000)...
151
+ }
152
+ });
153
+ myvad.start();
154
+ };
155
+ onMounted(async () => {
156
+ const { code, message } = await stopMessage();
157
+ if (code !== 0) {
158
+ ElMessage({
159
+ type: 'error',
160
+ message: message,
161
+ duration: 3000,
162
+ customClass: 'system-error'
163
+ });
164
+ return;
165
+ }
166
+ callDisabled.value = false;
167
+ });
168
+ const delay = ms => {
169
+ return new Promise(resolve => setTimeout(resolve, ms));
170
+ };
171
+ const initRecording = async () => {
172
+ uploadUserConfig()
173
+ .then(async () => {
174
+ if (!audioDOM) {
175
+ audioDOM = new Audio();
176
+ audioDOM.playsinline = true;
177
+ audioDOM.preload = 'auto';
178
+ }
179
+ // every call needs a newly generated uid
180
+ setNewUserId();
181
+ buildConnect();
182
+ await delay(100);
183
+ initVideoStream('environment');
184
+ if (socket) {
185
+ socket.close();
186
+ }
187
+ socket = new WebSocketService(
188
+ `/ws/stream${window.location.search}&uid=${getNewUserId()}&service=minicpmo-server`
189
+ );
190
+ socket.connect();
191
+ initVideoStream('environment');
192
+ if (localStorage.getItem('canStopByVoice') === 'true') {
193
+ vadStart();
194
+ }
195
+ })
196
+ .catch(() => {});
197
+ };
198
+ // switch camera
199
+ const switchCamera = () => {
200
+ if (!isCalling.value) {
201
+ return;
202
+ }
203
+ isFrontCamera.value = !isFrontCamera.value;
204
+ const facingMode = isFrontCamera.value ? 'environment' : 'user'; // 'user' = front camera, 'environment' = rear camera
205
+ initVideoStream(facingMode);
206
+ };
207
+ const initVideoStream = async facingMode => {
208
+ if (mediaStream) {
209
+ mediaStream.getTracks().forEach(track => track.stop());
210
+ videoStream.value = null;
211
+ }
212
+ outputData.value = [];
213
+ isCalling.value = true;
214
+ loading.value = true;
215
+ if (!videoStream.value) {
216
+ try {
217
+ mediaStream = await window.navigator.mediaDevices.getUserMedia({
218
+ video: { facingMode },
219
+ audio: true
220
+ });
221
+ console.log('mediaStream', mediaStream);
222
+ videoStream.value = mediaStream;
223
+ videoRef.value.srcObject = mediaStream;
224
+ loading.value = false;
225
+ console.log('After opening the stream: ', +new Date());
226
+ // takePhotos();
227
+ audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 16000 });
228
+ console.log('audioContext: ', audioContext);
229
+ const audioSource = audioContext.createMediaStreamSource(mediaStream);
230
+ interval.value = setInterval(() => dealImage(), 50);
231
+ // create a ScriptProcessorNode to capture the audio data
232
+ const processor = audioContext.createScriptProcessor(256, 1, 1);
233
+ processor.onaudioprocess = event => {
234
+ if (!isCalling.value) return;
235
+ if (isReturnError.value) {
236
+ stopRecording();
237
+ return;
238
+ }
239
+ const data = event.inputBuffer.getChannelData(0);
240
+ audioChunks.push(new Float32Array(data));
241
+ // check whether a full chunk (nominally one second) of data has been collected
242
+ const totalBufferLength = audioChunks.reduce((total, curr) => total + curr.length, 0);
243
+ // const chunkLength = audioContext.sampleRate;
244
+ const chunkLength = getChunkLength(audioContext.sampleRate);
245
+ if (totalBufferLength >= chunkLength) {
246
+ // merge everything into one buffer and cut off the first chunk (chunkLength samples)
246
+ const mergedBuffer = mergeBuffers(audioChunks, totalBufferLength);
247
+ const oneSecondBuffer = mergedBuffer.slice(0, chunkLength);
248
+
249
+ // encode the chunk as WAV and queue it for sending
250
+ addQueue(+new Date(), () => saveAudioChunk(oneSecondBuffer, +new Date()));
251
+
252
+ // keep the leftover samples for the next chunk
253
+ audioChunks = [mergedBuffer.slice(chunkLength)];
255
+ }
256
+ };
257
+ analyser.value = audioContext.createAnalyser();
258
+ // connect the audio source node to the analyser
259
+ audioSource.connect(analyser.value);
260
+ // analyser settings
261
+ analyser.value.fftSize = 256;
262
+ const bufferLength = analyser.value.frequencyBinCount;
263
+ dataArray.value = new Uint8Array(bufferLength);
264
+ // start drawing the waveform
265
+ drawBars();
266
+
267
+ audioSource.connect(processor);
268
+ processor.connect(audioContext.destination);
269
+ } catch {}
270
+ }
271
+ };
272
+ const drawText = async () => {
273
+ if (textQueue.value.length > 0) {
274
+ outputData.value[outputData.value.length - 1].text += textQueue.value[0];
275
+ textQueue.value = textQueue.value.slice(1);
276
+ } else {
277
+ cancelAnimationFrame(textAnimationInterval.value);
278
+ }
279
+ textAnimationInterval.value = requestAnimationFrame(drawText);
280
+ };
281
+ const getStopValue = () => {
282
+ return stop.value;
283
+ };
284
+ const getPlayingValue = () => {
285
+ return playing.value;
286
+ };
287
+ const getStopStatus = () => {
288
+ return localStorage.getItem('canStopByVoice') === 'true';
289
+ };
290
+ const saveAudioChunk = (buffer, timestamp) => {
291
+ return new Promise(resolve => {
292
+ if (!getStopStatus() && getPlayingValue()) {
293
+ resolve();
294
+ return;
295
+ }
296
+ const wavBlob = encodeWAV(buffer, audioContext.sampleRate);
297
+ let reader = new FileReader();
298
+ reader.readAsDataURL(wavBlob);
299
+ reader.onloadend = async function () {
300
+ let base64data = reader.result.split(',')[1];
301
+ const imgBase64 = videoImage.value[videoImage.value.length - 1]?.src;
302
+ if (!(base64data && imgBase64)) {
303
+ resolve();
304
+ return;
305
+ }
306
+ const strBase64 = imgBase64.split(',')[1];
307
+ count++;
308
+ let obj = {
309
+ messages: [
310
+ {
311
+ role: 'user',
312
+ content: [
313
+ {
314
+ type: 'input_audio',
315
+ input_audio: {
316
+ data: base64data,
317
+ format: 'wav',
318
+ timestamp: String(timestamp)
319
+ }
320
+ }
321
+ ]
322
+ }
323
+ ]
324
+ };
325
+ obj.messages[0].content.unshift({
326
+ type: 'image_data',
327
+ image_data: {
328
+ data: count === maxCount ? strBase64 : '',
329
+ type: 2
330
+ }
331
+ });
332
+ if (count === maxCount) {
333
+ count = 0;
334
+ }
335
+ socket.send(JSON.stringify(obj));
336
+ socket.on('message', data => {
337
+ console.log('message: ', data);
338
+ delayTimestamp.value = +new Date() - timestamp;
339
+ delayCount.value = taskQueue.value.length;
340
+ resolve();
341
+ });
342
+ // send the Base64 audio data to the backend
343
+ // try {
344
+ // await sendMessage(obj);
345
+ // delayTimestamp.value = +new Date() - timestamp;
346
+ // delayCount.value = taskQueue.value.length;
347
+ // } catch (err) {}
348
+ // resolve();
349
+ };
350
+ });
351
+ };
352
+ const mergeBuffers = (buffers, length) => {
353
+ const result = new Float32Array(length);
354
+ let offset = 0;
355
+ for (let buffer of buffers) {
356
+ result.set(buffer, offset);
357
+ offset += buffer.length;
358
+ }
359
+ return result;
360
+ };
361
+ const stopRecording = () => {
362
+ isCalling.value = false;
363
+ clearInterval(interval.value);
364
+ interval.value = null;
365
+ if (audioRecorder && audioRecorder.state !== 'inactive') {
366
+ audioRecorder.stop();
367
+ }
368
+ if (animationFrameId.value) {
369
+ cancelAnimationFrame(animationFrameId.value);
370
+ }
371
+ if (audioContext && audioContext.state !== 'closed') {
372
+ audioContext.close();
373
+ }
374
+ destroyVideoStream();
375
+ taskQueue.value = [];
376
+ audioPlayQueue.value = [];
377
+ base64List.value = [];
378
+ ctrl.abort();
379
+ ctrl = new AbortController();
380
+ isReturnError.value = false;
381
+ skipDisabled.value = true;
382
+ playing.value = false;
383
+ audioDOM?.pause();
384
+ stopMessage();
385
+ if (socket) {
386
+ socket.close();
387
+ }
388
+ if (
389
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
390
+ outputData.value[outputData.value.length - 1].audio === '' &&
391
+ allVoice.value.length > 0
392
+ ) {
393
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
394
+ }
395
+ myvad && myvad.destroy();
396
+ };
397
+ // establish the connection
398
+ const buildConnect = () => {
399
+ const obj = {
400
+ messages: [
401
+ {
402
+ role: 'user',
403
+ content: [{ type: 'none' }]
404
+ }
405
+ ],
406
+ stream: true
407
+ };
408
+ isEnd.value = false;
409
+ ctrl.abort();
410
+ ctrl = new AbortController();
411
+ const url = `/api/v1/completions${window.location.search}`;
412
+
413
+ fetchEventSource(url, {
414
+ method: 'POST',
415
+ headers: {
416
+ 'Content-Type': 'application/json',
417
+ service: 'minicpmo-server',
418
+ uid: getNewUserId()
419
+ },
420
+ body: JSON.stringify(obj),
421
+ signal: ctrl.signal,
422
+ openWhenHidden: true,
423
+ async onopen(response) {
424
+ isFirstPiece.value = true;
425
+ isFirstReturn.value = true;
426
+ allVoice.value = [];
427
+ base64List.value = [];
428
+ console.log('onopen', response);
429
+ if (response.status !== 200) {
430
+ ElMessage({
431
+ type: 'error',
432
+ message: 'At limit. Please try again soon.',
433
+ duration: 3000,
434
+ customClass: 'system-error'
435
+ });
436
+ isReturnError.value = true;
437
+ } else {
438
+ isReturnError.value = false;
439
+ drawText();
440
+ }
441
+ },
442
+ onmessage(msg) {
443
+ const data = JSON.parse(msg.data);
444
+ if (data.response_id) {
445
+ curResponseId.value = data.response_id;
446
+ }
447
+ if (data.choices[0]?.text) {
448
+ textQueue.value += data.choices[0].text.replace('<end>', '');
449
+ console.warn('text return time -------------------------------', +new Date());
450
+ }
451
+ // the first message echoes the audio chunk the frontend sent to the backend, so it is handled separately
452
+ if (isFirstReturn.value) {
453
+ console.log('First return');
454
+ isFirstReturn.value = false;
455
+ // if the backend returned empty audio, reconnect
456
+ if (!data.choices[0].audio) {
457
+ buildConnect();
458
+ return;
459
+ }
460
+ outputData.value.push({
461
+ type: 'USER',
462
+ audio: `data:audio/wav;base64,${data.choices[0].audio}`
463
+ });
464
+ outputData.value.push({
465
+ type: 'BOT',
466
+ text: '',
467
+ audio: ''
468
+ });
469
+ return;
470
+ }
471
+ if (data.choices[0]?.audio) {
472
+ console.log('audio return time -------------------------------', +new Date());
473
+ if (!getStopValue() && isCalling.value) {
474
+ skipDisabled.value = false;
475
+ base64List.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
476
+ addAudioQueue(() => truePlay(data.choices[0].audio));
477
+ }
478
+ allVoice.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
479
+ } else {
480
+ // something went wrong; reconnect immediately
481
+ buildConnect();
482
+ }
483
+ if (data.choices[0].text.includes('<end>')) {
484
+ console.log('End marker received:', +new Date());
485
+ if (
486
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
487
+ outputData.value[outputData.value.length - 1].audio === '' &&
488
+ allVoice.value.length > 0
489
+ ) {
490
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
491
+ }
492
+ }
493
+ },
494
+ onclose() {
495
+ console.log('onclose', +new Date());
496
+ isEnd.value = true;
497
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
498
+ // after the SSE closes, an empty playback queue means the model failed and this connection returned no audio, so reconnect immediately
499
+ vadStartTime.value = +new Date();
500
+ if (audioPlayQueue.value.length === 0) {
501
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
502
+ console.log('taskQueue111111111: ', taskQueue.value, startIndex);
503
+ if (startIndex !== -1) {
504
+ taskQueue.value = taskQueue.value.slice(startIndex);
505
+ console.log('Queue after trimming:', taskQueue.value, vadStartTime.value);
506
+ }
507
+ buildConnect();
508
+ }
509
+ },
510
+ onerror(err) {
511
+ console.log('onerror', err);
512
+ ctrl.abort();
513
+ ctrl = new AbortController();
514
+ throw err;
515
+ }
516
+ });
517
+ };
518
+ // put returned audio clips into a queue and play them one by one
519
+ const addAudioQueue = async item => {
520
+ audioPlayQueue.value.push(item);
521
+ if (isFirstPiece.value) {
522
+ await delay(1500);
523
+ isFirstPiece.value = false;
524
+ }
525
+ if (audioPlayQueue.value.length > 0 && !playing.value) {
526
+ playing.value = true;
527
+ playAudio();
528
+ }
529
+ };
530
+ // drive the playback queue
531
+ const playAudio = () => {
532
+ console.log('Remaining playback queue:', audioPlayQueue.value, +new Date());
533
+
534
+ if (!isEnd.value && base64List.value.length >= 2) {
535
+ const remainLen = base64List.value.length;
536
+ const blob = mergeBase64ToBlob(base64List.value);
537
+ audioDOM.src = blob;
538
+ audioDOM.play();
539
+ console.error('Early merged playback start time: ', +new Date());
540
+ audioDOM.onended = () => {
541
+ console.error('Early merged playback end time: ', +new Date());
542
+ base64List.value = base64List.value.slice(remainLen);
543
+ audioPlayQueue.value = audioPlayQueue.value.slice(remainLen);
544
+ playAudio();
545
+ };
546
+ return;
547
+ }
548
+ if (isEnd.value && base64List.value.length >= 2) {
549
+ const blob = mergeBase64ToBlob(base64List.value);
550
+ audioDOM.src = blob;
551
+ audioDOM.play();
552
+ console.error('Merged playback start time: ', +new Date());
553
+ audioDOM.onended = () => {
554
+ console.error('Merged playback end time: ', +new Date());
555
+ // URL.revokeObjectURL(url);
556
+ base64List.value = [];
557
+ audioPlayQueue.value = [];
558
+ playing.value = false;
559
+ skipDisabled.value = true;
560
+ if (isCalling.value && !isReturnError.value) {
561
+ // skipDisabled.value = true;
562
+ taskQueue.value = [];
563
+ // before interrupting, record the interrupt time or the VAD trigger time
564
+ // vadStartTime.value = +new Date();
565
+ // // after each completion, keep only the audio from the last 1s before now
566
+ // console.log(
567
+ // 'Queue before trimming:',
568
+ // taskQueue.value.map(item => item.time)
569
+ // );
570
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
571
+ // if (startIndex !== -1) {
572
+ // taskQueue.value = taskQueue.value.slice(startIndex);
573
+ // console.log(
574
+ // 'Queue after trimming:',
575
+ // taskQueue.value.map(item => item.time),
576
+ // vadStartTime.value
577
+ // );
578
+ // }
579
+ buildConnect();
580
+ }
581
+ };
582
+ return;
583
+ }
584
+ base64List.value.shift();
585
+ const _truePlay = audioPlayQueue.value.shift();
586
+ if (_truePlay) {
587
+ _truePlay().finally(() => {
588
+ playAudio();
589
+ });
590
+ } else {
591
+ playing.value = false;
592
+ if (isEnd.value) {
593
+ console.warn('play done................');
594
+ skipDisabled.value = true;
595
+ }
596
+ // once playback has finished, start the next connection if the call is still active and the API returned no error
597
+ if (isEnd.value && isCalling.value && !isReturnError.value) {
598
+ // skipDisabled.value = true;
599
+ taskQueue.value = [];
600
+ // // after skipping, keep only the audio chunks from shortly before now onward (the code below uses a 1000 ms window)
601
+ // vadStartTime.value = +new Date();
602
+ // console.log(
603
+ // 'Queue before trimming:',
604
+ // taskQueue.value.map(item => item.time)
605
+ // );
606
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
607
+ // if (startIndex !== -1) {
608
+ // taskQueue.value = taskQueue.value.slice(startIndex);
609
+ // console.log(
610
+ // 'Queue after trimming:',
611
+ // taskQueue.value.map(item => item.time),
612
+ // vadStartTime.value
613
+ // );
614
+ // }
615
+ buildConnect();
616
+ }
617
+ }
618
+ };
619
+ // play one audio clip
620
+ const truePlay = voice => {
621
+ console.log('promise: ', +new Date());
622
+ return new Promise(resolve => {
623
+ audioDOM.src = 'data:audio/wav;base64,' + voice;
624
+ console.error('Playback start time:', +new Date());
625
+ audioDOM
626
+ .play()
627
+ .then(() => {
628
+ console.log('Audio played successfully');
629
+ })
630
+ .catch(error => {
631
+ if (error.name === 'NotAllowedError' || error.name === 'SecurityError') {
632
+ console.error('User interaction required or permission issue:', error);
633
+ // ElMessage.warning('Audio playback failed');
634
+ console.error('Playback failed');
635
+ // alert('Please interact with the page (like clicking a button) to enable audio playback.');
636
+ } else {
637
+ console.error('Error playing audio:', error);
638
+ }
639
+ });
640
+ // .finally(() => {
641
+ // resolve();
642
+ // });
643
+ audioDOM.onerror = () => {
644
+ console.error('Playback failure time', +new Date());
645
+ resolve();
646
+ };
647
+ audioDOM.onended = () => {
648
+ console.error('Playback end time: ', +new Date());
649
+ // URL.revokeObjectURL(url);
650
+ resolve();
651
+ };
652
+ });
653
+ };
654
+ // when the queue holds at least one task, start processing the queued tasks
655
+ const addQueue = (time, item) => {
656
+ taskQueue.value.push({ func: item, time });
657
+ if (taskQueue.value.length > 0 && !running.value) {
658
+ running.value = true;
659
+ processQueue();
660
+ }
661
+ };
662
+ const processQueue = () => {
663
+ const item = taskQueue.value.shift();
664
+ if (item?.func) {
665
+ item.func()
666
+ .then(res => {
667
+ console.log('Processed task: ', res);
668
+ })
669
+ .finally(() => processQueue());
670
+ } else {
671
+ running.value = false;
672
+ }
673
+ };
674
+ const destroyVideoStream = () => {
675
+ videoStream.value?.getTracks().forEach(track => track.stop());
676
+ videoStream.value = null;
677
+ // set srcObject to null to break the link to the MediaStream object so that it can be released
678
+ videoRef.value.srcObject = null;
679
+
680
+ videoImage.value = [];
681
+ videoLoaded.value = false;
682
+
683
+ clearInterval(interval.value);
684
+ interval.value = null;
685
+ };
686
+ const dealImage = () => {
687
+ if (!videoRef.value) {
688
+ return;
689
+ }
690
+ const canvas = canvasRef.value;
691
+ canvasRef.value.width = videoRef.value.videoWidth;
692
+ canvasRef.value.height = videoRef.value.videoHeight;
693
+ const context = canvas.getContext('2d');
694
+ context.drawImage(videoRef.value, 0, 0, canvasRef.value.width, canvasRef.value.height);
695
+ const imageDataUrl = canvas.toDataURL('image/webp', 0.8);
696
+ videoImage.value.push({ src: imageDataUrl });
697
+ };
698
+ const drawBars = () => {
699
+ // AnalyserNode.getByteFrequencyData() copies the current frequency data into the Uint8Array (unsigned byte array) passed to it
700
+ analyser.value.getByteFrequencyData(dataArray.value);
701
+ animationFrameId.value = requestAnimationFrame(drawBars);
702
+ };
703
+ // skip the current clip
704
+ const skipVoice = async () => {
705
+ // before interrupting, record the interrupt time or the VAD trigger time
706
+ vadStartTime.value = +new Date();
707
+ if (!skipDisabled.value) {
708
+ if (
709
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
710
+ outputData.value[outputData.value.length - 1].audio === ''
711
+ ) {
712
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
713
+ }
714
+ base64List.value = [];
715
+ audioPlayQueue.value = [];
716
+ // after skipping, keep only the audio chunks from one second before now onward (the window below is 1000 ms)
717
+ console.log(
718
+ 'Queue before trimming:',
719
+ taskQueue.value.map(item => item.time)
720
+ );
721
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
722
+ if (startIndex !== -1) {
723
+ taskQueue.value = taskQueue.value.slice(startIndex);
724
+ console.log(
725
+ 'Queue after trimming:',
726
+ taskQueue.value.map(item => item.time),
727
+ vadStartTime.value
728
+ );
729
+ }
730
+ stop.value = true;
731
+ audioDOM?.pause();
732
+ setTimeout(() => {
733
+ skipDisabled.value = true;
734
+ }, 300);
735
+ try {
736
+ playing.value = false;
737
+ await stopMessage();
738
+ stop.value = false;
739
+ // playing.value = false;
740
+ buildConnect();
741
+ // cancelAnimationFrame(animationFrameId.value);
742
+ } catch (err) {}
743
+ }
744
+ };
745
+ // upload the current user configuration before every call
746
+ const uploadUserConfig = async () => {
747
+ if (!localStorage.getItem('configData')) {
748
+ return new Promise(resolve => resolve());
749
+ }
750
+ const {
751
+ videoQuality,
752
+ useAudioPrompt,
753
+ voiceClonePrompt,
754
+ assistantPrompt,
755
+ vadThreshold,
756
+ audioFormat,
757
+ base64Str
758
+ } = JSON.parse(localStorage.getItem('configData'));
759
+ const obj = {
760
+ messages: [
761
+ {
762
+ role: 'user',
763
+ content: [
764
+ {
765
+ type: 'input_audio',
766
+ input_audio: {
767
+ data: base64Str,
768
+ format: audioFormat
769
+ }
770
+ },
771
+ {
772
+ type: 'options',
773
+ options: {
774
+ hd_video: videoQuality,
775
+ use_audio_prompt: useAudioPrompt,
776
+ vad_threshold: vadThreshold,
777
+ voice_clone_prompt: voiceClonePrompt,
778
+ assistant_prompt: assistantPrompt
779
+ }
780
+ }
781
+ ]
782
+ }
783
+ ]
784
+ };
785
+ const { code, message, data } = await uploadConfig(obj);
786
+ modelVersion.value = data?.choices?.content || '';
787
+ return new Promise((resolve, reject) => {
788
+ if (code !== 0) {
789
+ ElMessage({
790
+ type: 'error',
791
+ message: message,
792
+ duration: 3000,
793
+ customClass: 'system-error'
794
+ });
795
+ reject();
796
+ } else {
797
+ resolve();
798
+ }
799
+ });
800
+ };
801
+ </script>
802
+ <style lang="less">
803
+ .video-page {
804
+ height: 100%;
805
+ display: flex;
806
+ flex-direction: column;
807
+ &-header {
808
+ display: flex;
809
+ align-items: center;
810
+ padding: 0 16px 16px;
811
+ box-shadow: 0 0.5px 0 0 #e0e0e0;
812
+ margin-bottom: 16px;
813
+ justify-content: space-between;
814
+ .header-icon {
815
+ display: flex;
816
+ align-items: center;
817
+ img {
818
+ width: 24px;
819
+ height: 24px;
820
+ margin-right: 8px;
821
+ }
822
+ span {
823
+ color: rgba(23, 23, 23, 0.9);
824
+ font-family: PingFang SC;
825
+ font-size: 16px;
826
+ font-style: normal;
827
+ font-weight: 500;
828
+ line-height: normal;
829
+ margin-right: 40px;
830
+ flex-shrink: 0;
831
+ }
832
+ }
833
+ .voice-container {
834
+ display: flex;
835
+ .voice-icon {
836
+ width: 191px;
837
+ height: 45px;
838
+ }
839
+ }
840
+ }
841
+ &-content {
842
+ flex: 1;
843
+ margin-bottom: 16px;
844
+ display: flex;
845
+ height: 0;
846
+ &-video {
847
+ width: 50%;
848
+ height: 100%;
849
+ background: #f3f3f3;
850
+ flex-shrink: 0;
851
+ position: relative;
852
+ video {
853
+ width: 100%;
854
+ height: 100%;
855
+ object-fit: contain;
856
+ }
857
+ .switch-camera {
858
+ position: absolute;
859
+ top: 10px;
860
+ right: 10px;
861
+ width: 36px;
862
+ height: 36px;
863
+ background: #ffffff;
864
+ border-radius: 6px;
865
+ display: flex;
866
+ justify-content: center;
867
+ align-items: center;
868
+ font-size: 24px;
869
+ z-index: 999;
870
+ .icon {
871
+ width: 20px;
872
+ height: 20px;
873
+ }
874
+ }
875
+ }
876
+ &-right {
877
+ margin-left: 16px;
878
+ flex: 1;
879
+ padding: 0 16px;
880
+ display: flex;
881
+ flex-direction: column;
882
+ .output-content {
883
+ flex: 1;
884
+ overflow: auto;
885
+ }
886
+ .skip-box {
887
+ display: flex;
888
+ align-items: center;
889
+ justify-content: flex-end;
890
+ margin-top: 16px;
891
+ }
892
+ }
893
+ }
894
+ &-btn {
895
+ text-align: center;
896
+ padding: 8px 0;
897
+ .el-button {
898
+ width: 284px;
899
+ height: 46px;
900
+ border-radius: 8px;
901
+ }
902
+ .el-button.el-button--success {
903
+ background: #647fff;
904
+ border-color: #647fff;
905
+ &:hover {
906
+ opacity: 0.8;
907
+ }
908
+ span {
909
+ color: #fff;
910
+ font-family: PingFang SC;
911
+ font-size: 16px;
912
+ font-style: normal;
913
+ font-weight: 500;
914
+ line-height: normal;
915
+ }
916
+ }
917
+ .el-button.el-button--success.is-disabled {
918
+ background: #f3f3f3;
919
+ border-color: #f3f3f3;
920
+ span {
921
+ color: #d1d1d1;
922
+ }
923
+ }
924
+ .el-button.el-button--danger {
925
+ border-color: #dc3545;
926
+ background-color: #dc3545;
927
+ color: #ffffff;
928
+ font-family: PingFang SC;
929
+ font-size: 16px;
930
+ font-style: normal;
931
+ font-weight: 500;
932
+ line-height: normal;
933
+ .phone-icon {
934
+ margin-right: 10px;
935
+ }
936
+ .btn-text {
937
+ margin-right: 10px;
938
+ }
939
+ .btn-desc {
940
+ margin-right: 16px;
941
+ }
942
+ }
943
+ }
944
+ }
945
+ .video-size {
946
+ position: absolute;
947
+ bottom: 10px;
948
+ right: 10px;
949
+ background: rgba(0, 0, 0, 0.5);
950
+ color: #fff;
951
+ padding: 4px 8px;
952
+ border-radius: 4px;
953
+ font-size: 12px;
954
+ }
955
+ </style>
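
The `onaudioprocess` handler in the component above collects small `Float32Array` frames, emits a fixed-length chunk once enough samples are buffered, and keeps the remainder for the next chunk. The following is a minimal standalone sketch of that idea, not the component's actual code: `createChunker` and `onChunk` are hypothetical names introduced here for illustration, and only `mergeBuffers` mirrors a function from the diff. Unlike the handler, the sketch loops so that it emits every complete chunk when one push supplies more than one chunk's worth of samples.

```javascript
// Concatenate a list of Float32Array frames into one buffer of `length` samples.
function mergeBuffers(buffers, length) {
  const result = new Float32Array(length);
  let offset = 0;
  for (const buffer of buffers) {
    result.set(buffer, offset);
    offset += buffer.length;
  }
  return result;
}

// Hypothetical helper: returns a push(frame) function that calls
// onChunk(chunk) for every complete chunk of `chunkLength` samples.
function createChunker(chunkLength, onChunk) {
  let pending = []; // frames (or leftovers) not yet emitted

  return function push(frame) {
    // Copy the frame, because audio callbacks may reuse their buffers.
    pending.push(new Float32Array(frame));
    const total = pending.reduce((sum, b) => sum + b.length, 0);
    if (total < chunkLength) return;

    let merged = mergeBuffers(pending, total);
    // Emit every complete chunk, then keep the remainder for later.
    while (merged.length >= chunkLength) {
      onChunk(merged.slice(0, chunkLength));
      merged = merged.slice(chunkLength);
    }
    pending = merged.length > 0 ? [merged] : [];
  };
}
```

In the component, `chunkLength` would come from `getChunkLength(audioContext.sampleRate)` and `onChunk` would wrap the `addQueue(..., () => saveAudioChunk(...))` call. Slicing and keeping the remainder with the same `chunkLength` avoids dropping or duplicating samples at chunk boundaries.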
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VoiceCall.vue ADDED
@@ -0,0 +1,833 @@
1
+ <template>
2
+ <!-- <ExtraInfo webVersion="non-websocket_0112" :modelVersion="modelVersion" /> -->
3
+ <div class="voice-page">
4
+ <div class="voice-page-header">
5
+ <div class="voice-container" v-if="!isCalling">
6
+ <SvgIcon name="voice" class="voice-icon" />
7
+ <SvgIcon name="voice" class="voice-icon" />
8
+ <SvgIcon name="voice" class="voice-icon" />
9
+ </div>
10
+ <div class="voice-container" v-else>
11
+ <Voice
12
+ :dataArray="dataArray"
13
+ :isCalling="isCalling"
14
+ :isPlaying="playing"
15
+ :configList="videoConfigList"
16
+ :boxStyle="{ height: '45px' }"
17
+ :itemStyle="{ width: '3px', margin: '0 1px' }"
18
+ />
19
+ </div>
20
+ <!-- <SelectTimbre v-model:timbre="timbre" v-model:audioData="audioData" v-model:disabled="isCalling" /> -->
21
+ </div>
22
+ <div class="voice-page-output">
23
+ <div class="output-content">
24
+ <ModelOutput v-if="outputData.length > 0" :outputData="outputData" containerClass="output-content" />
25
+ </div>
26
+ <div class="skip-box">
27
+ <!-- <DelayTips
28
+ v-if="delayTimestamp > 200 || delayCount > 2"
29
+ :delayTimestamp="delayTimestamp"
30
+ :delayCount="delayCount"
31
+ /> -->
32
+ <LikeAndDislike v-model:feedbackStatus="feedbackStatus" v-model:curResponseId="curResponseId" />
33
+ <SkipBtn :disabled="skipDisabled" @click="skipVoice" />
34
+ </div>
35
+ </div>
36
+ <div class="voice-page-btn">
37
+ <el-button v-show="!isCalling" type="success" :disabled="callDisabled" @click="initRecording">
38
+ {{ callDisabled ? t('notReadyBtn') : t('audioCallBtn') }}
39
+ </el-button>
40
+ <el-button v-show="isCalling" @click="stopRecording" type="danger">
41
+ <SvgIcon name="phone-icon" className="phone-icon" />
42
+ <span class="btn-text">{{ t('hangUpBtn') }}</span>
43
+ <CountDown v-model="isCalling" @timeUp="stopRecording" />
44
+ </el-button>
45
+ </div>
46
+ <IdeasList v-if="showIdeasList" :ideasList="voiceIdeasList" />
47
+ </div>
48
+ </template>
49
+ <script setup>
50
+ import { sendMessage, stopMessage, uploadConfig } from '@/apis';
51
+ import { encodeWAV } from '@/hooks/useVoice';
52
+ import { getNewUserId, setNewUserId } from '@/hooks/useRandomId';
53
+ import { fetchEventSource } from '@microsoft/fetch-event-source';
54
+ import { MicVAD } from '@ricky0123/vad-web';
55
+ import { videoConfigList, voiceConfigList, voiceIdeasList, showIdeasList } from '@/enums';
56
+ import { getChunkLength } from '@/utils';
57
+ import { mergeBase64ToBlob } from './merge';
58
+ import WebSocketService from '@/utils/websocket';
59
+ import { useI18n } from 'vue-i18n';
60
+
61
+ const { t } = useI18n();
62
+
63
+ let ctrl = new AbortController();
64
+ let socket = null;
65
+ const audioData = ref({
66
+ base64Str: '',
67
+ type: 'mp3'
68
+ }); // base64 of the custom voice timbre
69
+ const isCalling = defineModel();
70
+ const taskQueue = ref([]);
71
+ const running = ref(false);
72
+ const outputData = ref([]);
73
+ const textQueue = ref('');
74
+ const textAnimationInterval = ref();
75
+
76
+ const isFirstReturn = ref(true); // the first audio returned is the chunk the frontend sent to the backend, so it is handled separately
77
+
78
+ const audioPlayQueue = ref([]);
79
+ const base64List = ref([]);
80
+ const playing = ref(false);
81
+ const skipDisabled = ref(true);
82
+ const stop = ref(false);
83
+ const timbre = ref([1]);
84
+ const isReturnError = ref(false);
85
+ const allVoice = ref([]);
86
+ const callDisabled = ref(true);
87
+
88
+ const feedbackStatus = ref('');
89
+ const curResponseId = ref('');
90
+ const delayTimestamp = ref(0); // latency of the chunk currently being sent
91
+ const delayCount = ref(0); // how much audio (in ms) is still waiting to be sent to the API
92
+
93
+ const modelVersion = ref('');
94
+
95
+ let audioDOM = new Audio();
96
+
97
+ const isEnd = ref(false); // set when the SSE connection closes, meaning the model has finished this response
98
+ // stop recording when the page is unmounted
99
+ onBeforeUnmount(() => {
100
+ stopRecording();
101
+ });
102
+ const vadStartTime = ref();
103
+ let myvad = null;
104
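+ let vadTimer = null; // VAD timer: checks whether speech stops within the timer window (500 ms below). If it stops in time, treat it as a false VAD trigger and ignore it; otherwise treat it as real speech and skip the current response automatically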
105
+ const vadStart = async () => {
106
+ myvad = await MicVAD.new({
107
+ onSpeechStart: () => {
108
+ console.log('Speech start detected');
109
+ // if (!skipDisabled.value) {
110
+ vadTimer && clearTimeout(vadTimer);
111
+ vadTimer = setTimeout(() => {
112
+ console.log('Interrupt time: ', +new Date());
113
+ skipVoice();
114
+ }, 500);
115
+ // }
116
+ },
117
+ onSpeechEnd: audio => {
118
+ vadTimer && clearTimeout(vadTimer);
119
+ // debugger;
120
+ // do something with `audio` (Float32Array of audio samples at sample rate 16000)...
121
+ },
122
+ baseAssetPath: '/'
123
+ });
124
+ console.log('vad: ', myvad);
125
+ myvad.start();
126
+ };
127
+ onMounted(async () => {
128
+ const { code, message } = await stopMessage();
129
+ if (code !== 0) {
130
+ ElMessage({
131
+ type: 'error',
132
+ message: message,
133
+ duration: 3000,
134
+ customClass: 'system-error'
135
+ });
136
+ return;
137
+ }
138
+ callDisabled.value = false;
139
+ });
140
+ const delay = ms => {
141
+ return new Promise(resolve => setTimeout(resolve, ms));
142
+ };
143
+ const initRecording = async () => {
144
+ uploadUserConfig()
145
+ .then(async () => {
146
+ // every call needs a newly generated uid
147
+ setNewUserId();
148
+
149
+ outputData.value = [];
150
+ buildConnect();
151
+ isCalling.value = true;
152
+ await delay(100);
153
+ // if (socket) {
154
+ // socket.close();
155
+ // }
156
+ // socket = new WebSocketService(
157
+ // `/ws/stream${window.location.search}&uid=${getNewUserId()}&service=minicpmo-server`
158
+ // );
159
+ // socket.connect();
160
+ // after the connection is established, wait briefly before sending data
161
+ startRecording();
162
+ if (localStorage.getItem('canStopByVoice') === 'true') {
163
+ vadStart();
164
+ }
165
+ })
166
+ .catch(() => {});
167
+ };
168
+ let audioContext;
169
+ const analyser = ref();
170
+ const dataArray = ref();
171
+ let mediaRecorder;
172
+ let audioChunks = [];
173
+ const animationFrameId = ref();
174
+
175
+ const isFirstPiece = ref(true);
176
+
177
+ const startRecording = async () => {
178
+ // get the user's audio stream
179
+ const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
180
+
181
+ // create the AudioContext and the MediaStreamAudioSourceNode
182
+ audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 16000 });
183
+ const source = audioContext.createMediaStreamSource(stream);
184
+
185
+ analyser.value = audioContext.createAnalyser();
186
+ // Connect the audio source node to the analyser
187
+ source.connect(analyser.value);
188
+ // Analyser settings
189
+ analyser.value.fftSize = 256;
190
+ const bufferLength = analyser.value.frequencyBinCount;
191
+ dataArray.value = new Uint8Array(bufferLength);
192
+ // Start drawing the waveform
193
+ drawBars();
194
+
195
+ // Create a ScriptProcessorNode to capture audio data
196
+ const processor = audioContext.createScriptProcessor(256, 1, 1);
197
+
198
+ processor.onaudioprocess = event => {
199
+ if (!isCalling.value) return;
200
+ if (isReturnError.value) {
201
+ stopRecording();
202
+ return;
203
+ }
204
+ const data = event.inputBuffer.getChannelData(0);
205
+ audioChunks.push(new Float32Array(data));
206
+ // Check whether one second of data has been collected
207
+ const totalBufferLength = audioChunks.reduce((total, curr) => total + curr.length, 0);
208
+ const chunkLength = getChunkLength(audioContext.sampleRate);
209
+ if (totalBufferLength >= chunkLength) {
210
+ // Merge into a single array and trim it to one second
211
+ const mergedBuffer = mergeBuffers(audioChunks, totalBufferLength);
212
+ const oneSecondBuffer = mergedBuffer.slice(0, chunkLength);
213
+ // Save and encode as WAV
214
+ addQueue(+new Date(), () => saveAudioChunk(oneSecondBuffer, +new Date()));
215
+ // Keep the leftover samples for the next chunk
216
+ audioChunks = [mergedBuffer.slice(chunkLength)];
217
+ }
218
+ };
219
+
220
+ source.connect(processor);
221
+ processor.connect(audioContext.destination);
222
+ };
223
+ const stopRecording = () => {
224
+ isCalling.value = false;
225
+ if (animationFrameId.value) {
226
+ cancelAnimationFrame(animationFrameId.value);
227
+ }
228
+ if (audioContext && audioContext.state !== 'closed') {
229
+ audioContext.close();
230
+ }
231
+ ctrl.abort();
232
+ ctrl = new AbortController();
233
+ taskQueue.value = [];
234
+ audioPlayQueue.value = [];
235
+ base64List.value = [];
236
+ isReturnError.value = false;
237
+ skipDisabled.value = true;
238
+ playing.value = false;
239
+ audioDOM.pause();
240
+ stopMessage();
241
+ if (socket) {
242
+ socket.close();
243
+ }
244
+ if (
245
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
246
+ outputData.value[outputData.value.length - 1].audio === '' &&
247
+ allVoice.value.length > 0
248
+ ) {
249
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
250
+ }
251
+ myvad && myvad.destroy();
252
+ };
253
+ const getStopValue = () => {
254
+ return stop.value;
255
+ };
256
+ const getPlayingValue = () => {
257
+ return playing.value;
258
+ };
259
+ const getStopStatus = () => {
260
+ return localStorage.getItem('canStopByVoice') === 'true';
261
+ };
262
+ const saveAudioChunk = (buffer, timestamp) => {
263
+ return new Promise(resolve => {
264
+ if (!getStopStatus() && getPlayingValue()) {
265
+ resolve();
266
+ return;
267
+ }
268
+ const wavBlob = encodeWAV(buffer, audioContext.sampleRate);
269
+ let reader = new FileReader();
270
+ reader.readAsDataURL(wavBlob);
271
+
272
+ reader.onloadend = async function () {
273
+ let base64data = reader.result.split(',')[1];
274
+ if (!base64data) {
275
+ resolve();
276
+ return;
277
+ }
278
+ const obj = {
279
+ uid: getNewUserId(),
280
+ messages: [
281
+ {
282
+ role: 'user',
283
+ content: [
284
+ {
285
+ type: 'input_audio',
286
+ input_audio: {
287
+ data: base64data,
288
+ format: 'wav',
289
+ timestamp: String(timestamp)
290
+ }
291
+ }
292
+ ]
293
+ }
294
+ ]
295
+ };
296
+ // socket.send(JSON.stringify(obj));
297
+ // socket.on('message', data => {
298
+ // console.log('message: ', data);
299
+ // delayTimestamp.value = +new Date() - timestamp;
300
+ // delayCount.value = taskQueue.value.length;
301
+ // resolve();
302
+ // });
303
+ // Send the Base64 audio data to the backend
304
+ try {
305
+ await sendMessage(obj);
306
+ delayTimestamp.value = +new Date() - timestamp;
307
+ delayCount.value = taskQueue.value.length;
308
+ } catch (err) {}
309
+ resolve();
310
+ };
311
+ });
312
+ };
313
+ const mergeBuffers = (buffers, length) => {
314
+ const result = new Float32Array(length);
315
+ let offset = 0;
316
+ for (let buffer of buffers) {
317
+ result.set(buffer, offset);
318
+ offset += buffer.length;
319
+ }
320
+ return result;
321
+ };
322
+ // Establish the connection
323
+ const buildConnect = async () => {
324
+ const obj = {
325
+ messages: [
326
+ {
327
+ role: 'user',
328
+ content: [{ type: 'none' }]
329
+ }
330
+ ],
331
+ stream: true
332
+ };
333
+ isEnd.value = false;
334
+ ctrl.abort();
335
+ ctrl = new AbortController();
336
+ const url = `/api/v1/completions${window.location.search}`;
337
+ fetchEventSource(url, {
338
+ method: 'POST',
339
+ headers: {
340
+ 'Content-Type': 'application/json',
341
+ service: 'minicpmo-server',
342
+ uid: getNewUserId()
343
+ },
344
+ body: JSON.stringify(obj),
345
+ signal: ctrl.signal,
346
+ openWhenHidden: true,
347
+ async onopen(response) {
348
+ console.log('onopen', response);
349
+ isFirstPiece.value = true;
350
+ isFirstReturn.value = true;
351
+ allVoice.value = [];
352
+ base64List.value = [];
353
+ if (response.status !== 200) {
354
+ ElMessage({
355
+ type: 'error',
356
+ message: 'At limit. Please try again soon.',
357
+ duration: 3000,
358
+ customClass: 'system-error'
359
+ });
360
+ isReturnError.value = true;
361
+ } else {
362
+ isReturnError.value = false;
363
+ // skipDisabled.value = false;
364
+ drawText();
365
+ }
366
+ },
367
+ onmessage(msg) {
368
+ const data = JSON.parse(msg.data);
369
+ if (data.response_id) {
370
+ curResponseId.value = data.response_id;
371
+ }
372
+ if (data.choices[0]?.text) {
373
+ textQueue.value += data.choices[0].text.replace('<end>', '');
374
+ console.warn('text return time -------------------------------', +new Date());
375
+ }
376
+ // The first message returned is the audio clip the frontend sent to the backend, so handle it separately
377
+ if (isFirstReturn.value) {
378
+ console.log('First return');
379
+ isFirstReturn.value = false;
380
+ // If the backend returned no audio, reconnect
381
+ if (!data.choices[0].audio) {
382
+ buildConnect();
383
+ return;
384
+ }
385
+ outputData.value.push({
386
+ type: 'USER',
387
+ audio: `data:audio/wav;base64,${data.choices[0].audio}`
388
+ });
389
+ outputData.value.push({
390
+ type: 'BOT',
391
+ text: '',
392
+ audio: ''
393
+ });
394
+ return;
395
+ }
396
+ if (data.choices[0]?.audio) {
397
+ console.warn('audio return time -------------------------------', +new Date());
398
+ if (!getStopValue() && isCalling.value) {
399
+ skipDisabled.value = false;
400
+ base64List.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
401
+ addAudioQueue(() => truePlay(data.choices[0].audio));
402
+ }
403
+ allVoice.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
404
+ } else {
405
+ // Something went wrong; reconnect
406
+ buildConnect();
407
+ }
408
+ if (data.choices[0]?.text?.includes('<end>')) {
409
+ // isEnd.value = true;
410
+ console.log('End marker received:', +new Date());
411
+ if (
412
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
413
+ outputData.value[outputData.value.length - 1].audio === '' &&
414
+ allVoice.value.length > 0
415
+ ) {
416
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
417
+ }
418
+ }
419
+ },
420
+ onclose() {
421
+ console.log('onclose', +new Date());
422
+ isEnd.value = true;
423
+ if (
424
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
425
+ outputData.value[outputData.value.length - 1].audio === '' &&
426
+ allVoice.value.length > 0
427
+ ) {
428
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
429
+ }
430
+ vadStartTime.value = +new Date();
431
+ if (audioPlayQueue.value.length === 0) {
432
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 2000);
433
+ console.log('taskQueue111111111: ', taskQueue.value, startIndex);
434
+ if (startIndex !== -1) {
435
+ taskQueue.value = taskQueue.value.slice(startIndex);
436
+ console.log('Length after trimming:', taskQueue.value, vadStartTime.value);
437
+ }
438
+ buildConnect();
439
+ }
440
+ },
441
+ onerror(err) {
442
+ console.log('onerror', err);
443
+ ctrl.abort();
444
+ ctrl = new AbortController();
445
+ throw err;
446
+ }
447
+ });
448
+ };
449
+ const drawText = async () => {
450
+ if (textQueue.value.length > 0) {
451
+ outputData.value[outputData.value.length - 1].text += textQueue.value[0];
452
+ textQueue.value = textQueue.value.slice(1);
453
+ } else {
454
+ cancelAnimationFrame(textAnimationInterval.value);
455
+ }
456
+ textAnimationInterval.value = requestAnimationFrame(drawText);
457
+ };
458
+ // Queue the returned audio and play it one piece at a time
459
+ const addAudioQueue = async item => {
460
+ audioPlayQueue.value.push(item);
461
+ if (isFirstPiece.value) {
462
+ await delay(500);
463
+ isFirstPiece.value = false;
464
+ }
465
+ if (audioPlayQueue.value.length > 0 && !playing.value) {
466
+ playing.value = true;
467
+ playAudio();
468
+ }
469
+ };
470
+ // Drive the playback queue
471
+ const playAudio = () => {
472
+ console.log('Remaining playback queue:', audioPlayQueue.value, +new Date());
473
+
474
+ if (!isEnd.value && base64List.value.length >= 2) {
475
+ const remainLen = base64List.value.length;
476
+ const blob = mergeBase64ToBlob(base64List.value);
477
+ audioDOM.src = blob;
478
+ audioDOM.play();
479
+ console.error('Early merged playback start time: ', +new Date());
480
+ audioDOM.onended = () => {
481
+ console.error('Early merged playback end time: ', +new Date());
482
+ base64List.value = base64List.value.slice(remainLen);
483
+ audioPlayQueue.value = audioPlayQueue.value.slice(remainLen);
484
+ playAudio();
485
+ };
486
+ return;
487
+ }
488
+ if (isEnd.value && base64List.value.length >= 2) {
489
+ const blob = mergeBase64ToBlob(base64List.value);
490
+ // let audio = new Audio();
491
+ audioDOM.src = blob;
492
+ audioDOM.play();
493
+ console.error('Final merged playback start time: ', +new Date());
494
+ audioDOM.onended = () => {
495
+ console.error('Final merged playback end time: ', +new Date());
496
+ // URL.revokeObjectURL(url);
497
+ base64List.value = [];
498
+ audioPlayQueue.value = [];
499
+ playing.value = false;
500
+ skipDisabled.value = true;
501
+ if (isCalling.value && !isReturnError.value) {
502
+ // skipDisabled.value = true;
503
+ taskQueue.value = [];
504
+ // Before interrupting, record the interrupt time or the VAD trigger event
505
+ // vadStartTime.value = +new Date();
506
+ // // After each completion, keep only the audio from the last 1 s before now
507
+ // console.log('Length before trimming:', JSON.parse(JSON.stringify(taskQueue.value.map(item => item.time))));
508
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 2000);
509
+ // if (startIndex !== -1) {
510
+ // taskQueue.value = taskQueue.value.slice(startIndex);
511
+ // console.log(
512
+ // 'Length after trimming:',
513
+ // taskQueue.value.map(item => item.time),
514
+ // vadStartTime.value
515
+ // );
516
+ // }
517
+ buildConnect();
518
+ }
519
+ };
520
+ return;
521
+ }
522
+ base64List.value.shift();
523
+ const item = audioPlayQueue.value.shift();
524
+ if (item) {
525
+ item().finally(() => playAudio());
526
+ } else {
527
+ playing.value = false;
528
+ if (isEnd.value) {
529
+ console.warn('play done................');
530
+ skipDisabled.value = true;
531
+ }
532
+ // Once playback has finished, start the next connection if the call is still active and the API has not returned an error
533
+ if (isEnd.value && isCalling.value && !isReturnError.value) {
534
+ // skipDisabled.value = true;
535
+ taskQueue.value = [];
536
+ // Before interrupting, record the interrupt time or the VAD trigger event
537
+ // vadStartTime.value = +new Date();
538
+ // // After each completion, keep only the audio from the last 1 s before now
539
+ // console.log(
540
+ // 'Length before trimming:',
541
+ // taskQueue.value.map(item => item.time)
542
+ // );
543
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 2000);
544
+ // if (startIndex !== -1) {
545
+ // taskQueue.value = taskQueue.value.slice(startIndex);
546
+ // console.log(
547
+ // 'Length after trimming:',
548
+ // taskQueue.value.map(item => item.time),
549
+ // vadStartTime.value
550
+ // );
551
+ // }
552
+ buildConnect();
553
+ }
554
+ }
555
+ };
556
+
557
+ // Play audio
558
+ const truePlay = async voice => {
559
+ return new Promise(resolve => {
560
+ audioDOM.src = 'data:audio/wav;base64,' + voice;
561
+ console.error('Playback start time:', +new Date());
562
+ audioDOM
563
+ .play()
564
+ .then(() => {
565
+ // console.error('Playback end time: ', +new Date());
566
+ })
567
+ .catch(error => {
568
+ resolve();
569
+ if (error.name === 'NotAllowedError' || error.name === 'SecurityError') {
570
+ console.error('User interaction required or permission issue:', error);
571
+ ElMessage.warning('Audio playback failed');
572
+ } else {
573
+ console.error('Error playing audio:', error);
574
+ }
575
+ });
576
+
577
+ audioDOM.onended = () => {
578
+ console.error('Playback end time: ', +new Date());
579
+ // URL.revokeObjectURL(url);
580
+ resolve();
581
+ };
582
+ });
583
+ };
584
+ // When the queue holds at least one task, start processing the queued tasks
585
+ const addQueue = (time, item) => {
586
+ taskQueue.value.push({ func: item, time });
587
+ if (taskQueue.value.length > 0 && !running.value) {
588
+ running.value = true;
589
+ processQueue();
590
+ }
591
+ };
592
+ const processQueue = () => {
593
+ const item = taskQueue.value.shift();
594
+ if (item?.func) {
595
+ item.func().then(() => {
596
+ console.warn('shift!!!!!!!!!');
597
+ processQueue();
598
+ });
599
+ } else {
600
+ running.value = false;
601
+ }
602
+ };
603
+ const drawBars = () => {
604
+ // AnalyserNode.getByteFrequencyData() copies the current frequency data into the Uint8Array (unsigned byte array) passed to it.
605
+ analyser.value.getByteFrequencyData(dataArray.value);
606
+ animationFrameId.value = requestAnimationFrame(drawBars);
607
+ };
608
+ // Skip the current clip
609
+ const skipVoice = async () => {
610
+ // Before interrupting, record the interrupt time or the VAD trigger event
611
+ vadStartTime.value = +new Date();
612
+ if (!skipDisabled.value) {
613
+ if (
614
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
615
+ outputData.value[outputData.value.length - 1].audio === ''
616
+ ) {
617
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
618
+ }
619
+ base64List.value = [];
620
+ audioPlayQueue.value = [];
621
+ // After skipping, keep only the audio chunks from the last second (vadStartTime - 1000 ms) onward
622
+ console.log(
623
+ 'Length before trimming:',
624
+ taskQueue.value.map(item => item.time)
625
+ );
626
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
627
+ if (startIndex !== -1) {
628
+ taskQueue.value = taskQueue.value.slice(startIndex);
629
+ console.log(
630
+ 'Length after trimming:',
631
+ taskQueue.value.map(item => item.time),
632
+ vadStartTime.value
633
+ );
634
+ }
635
+ stop.value = true;
636
+ audioDOM.pause();
637
+ setTimeout(() => {
638
+ skipDisabled.value = true;
639
+ }, 300);
640
+ try {
641
+ playing.value = false;
642
+ await stopMessage();
643
+ stop.value = false;
644
+ // playing.value = false;
645
+ buildConnect();
646
+ // cancelAnimationFrame(animationFrameId.value);
647
+ } catch (err) {}
648
+ }
649
+ };
650
+ // Upload the current user configuration before every call
651
+ const uploadUserConfig = async () => {
652
+ if (!localStorage.getItem('configData')) {
653
+ return new Promise(resolve => resolve());
654
+ }
655
+ const {
656
+ videoQuality,
657
+ useAudioPrompt,
658
+ voiceClonePrompt,
659
+ assistantPrompt,
660
+ vadThreshold,
661
+ audioFormat,
662
+ base64Str
663
+ } = JSON.parse(localStorage.getItem('configData'));
664
+ const obj = {
665
+ messages: [
666
+ {
667
+ role: 'user',
668
+ content: [
669
+ {
670
+ type: 'input_audio',
671
+ input_audio: {
672
+ data: base64Str,
673
+ format: audioFormat
674
+ }
675
+ },
676
+ {
677
+ type: 'options',
678
+ options: {
679
+ hd_video: videoQuality,
680
+ use_audio_prompt: useAudioPrompt,
681
+ vad_threshold: vadThreshold,
682
+ voice_clone_prompt: voiceClonePrompt,
683
+ assistant_prompt: assistantPrompt
684
+ }
685
+ }
686
+ ]
687
+ }
688
+ ]
689
+ };
690
+ const { code, message, data } = await uploadConfig(obj);
691
+ modelVersion.value = data?.choices?.content || '';
692
+ return new Promise((resolve, reject) => {
693
+ if (code !== 0) {
694
+ ElMessage({
695
+ type: 'error',
696
+ message: message,
697
+ duration: 3000,
698
+ customClass: 'system-error'
699
+ });
700
+ reject();
701
+ } else {
702
+ resolve();
703
+ }
704
+ });
705
+ };
706
+ </script>
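
Review note on the chunking in `startRecording`: the `onaudioprocess` callback accumulates small `Float32Array` frames, emits a fixed-length chunk once enough samples are buffered, and carries the remainder into the next callback. The sketch below isolates that logic so it can be checked outside the browser. It is only an illustration under stated assumptions: `createChunker` is a hypothetical helper that does not exist in this commit, the chunk length is passed in directly instead of coming from `getChunkLength`, and it uses a `while` loop so that a single large frame can yield several chunks, whereas the component emits at most one chunk per callback.

```javascript
// Same merge routine as in the component: concatenate frames into one array.
function mergeBuffers(buffers, length) {
  const result = new Float32Array(length);
  let offset = 0;
  for (const buffer of buffers) {
    result.set(buffer, offset);
    offset += buffer.length;
  }
  return result;
}

// Hypothetical helper (not part of the commit): returns a function that
// accepts audio frames and calls onChunk with every full chunk it can build.
function createChunker(chunkLength, onChunk) {
  let pending = [];
  return frame => {
    // Copy the frame, because Web Audio may reuse the underlying buffer.
    pending.push(new Float32Array(frame));
    const total = pending.reduce((sum, buf) => sum + buf.length, 0);
    if (total < chunkLength) return;
    let merged = mergeBuffers(pending, total);
    while (merged.length >= chunkLength) {
      onChunk(merged.slice(0, chunkLength));
      merged = merged.slice(chunkLength);
    }
    // Keep the leftover samples for the next call.
    pending = [merged];
  };
}

// Usage with a tiny chunk length so the behaviour is easy to follow.
const chunks = [];
const push = createChunker(4, chunk => chunks.push(Array.from(chunk)));
push([1, 2, 3]);
push([4, 5, 6]);
push([7, 8, 9]);
console.log(chunks);
```

Keeping the leftover samples in `pending` is what prevents audio from being dropped at chunk boundaries; the component does the same with `audioChunks = [mergedBuffer.slice(chunkLength)]`.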
707
+ <style lang="less" scoped>
708
+ .voice-page {
709
+ flex: 1;
710
+ height: 100%;
711
+ display: flex;
712
+ flex-direction: column;
713
+ &-header {
714
+ display: flex;
715
+ align-items: center;
716
+ justify-content: center;
717
+ padding: 0 16px 16px;
718
+ box-shadow: 0 0.5px 0 0 #e0e0e0;
719
+ margin-bottom: 16px;
720
+ .header-icon {
721
+ display: flex;
722
+ align-items: center;
723
+ img {
724
+ width: 24px;
725
+ height: 24px;
726
+ margin-right: 8px;
727
+ }
728
+ span {
729
+ color: rgba(23, 23, 23, 0.9);
730
+ font-family: PingFang SC;
731
+ font-size: 16px;
732
+ font-style: normal;
733
+ font-weight: 500;
734
+ line-height: normal;
735
+ margin-right: 40px;
736
+ flex-shrink: 0;
737
+ }
738
+ }
739
+ .voice-container {
740
+ display: flex;
741
+ .voice-icon {
742
+ width: 191px;
743
+ height: 45px;
744
+ }
745
+ }
746
+ }
747
+ &-output {
748
+ flex: 1;
749
+ height: 0;
750
+ padding: 0 16px;
751
+ margin-bottom: 16px;
752
+ display: flex;
753
+ flex-direction: column;
754
+ .output-content {
755
+ flex: 1;
756
+ overflow: auto;
757
+ }
758
+ .skip-box {
759
+ display: flex;
760
+ align-items: center;
761
+ justify-content: flex-end;
762
+ margin-top: 16px;
763
+ }
764
+ }
765
+ &-btn {
766
+ text-align: center;
767
+ padding: 8px 0;
768
+ .el-button {
769
+ width: 284px;
770
+ height: 46px;
771
+ border-radius: 8px;
772
+ }
773
+ .el-button.el-button--success {
774
+ background: #647fff;
775
+ border-color: #647fff;
776
+ &:hover {
777
+ opacity: 0.8;
778
+ }
779
+ span {
780
+ color: #fff;
781
+ font-family: PingFang SC;
782
+ font-size: 16px;
783
+ font-style: normal;
784
+ font-weight: 500;
785
+ line-height: normal;
786
+ }
787
+ }
788
+ .el-button.el-button--success.is-disabled {
789
+ background: #f3f3f3;
790
+ border-color: #f3f3f3;
791
+ span {
792
+ color: #d1d1d1;
793
+ }
794
+ }
795
+ .el-button.el-button--danger {
796
+ border-color: #dc3545;
797
+ background-color: #dc3545;
798
+ color: #ffffff;
799
+ font-family: PingFang SC;
800
+ font-size: 16px;
801
+ font-style: normal;
802
+ font-weight: 500;
803
+ line-height: normal;
804
+ .phone-icon {
805
+ margin-right: 10px;
806
+ }
807
+ .btn-text {
808
+ margin-right: 10px;
809
+ }
810
+ .btn-desc {
811
+ margin-right: 16px;
812
+ }
813
+ .time {
814
+ display: flex;
815
+ align-items: center;
816
+ .time-minute,
817
+ .time-second {
818
+ width: 26px;
819
+ height: 26px;
820
+ display: flex;
821
+ justify-content: center;
822
+ align-items: center;
823
+ border-radius: 3.848px;
824
+ background: rgba(47, 47, 47, 0.5);
825
+ }
826
+ .time-colon {
827
+ margin: 0 3px;
828
+ }
829
+ }
830
+ }
831
+ }
832
+ }
833
+ </style>
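
Review note on `addQueue`, `processQueue` and the trimming in `skipVoice`: upload tasks are run strictly one at a time, and on an interruption every queued task older than a cutoff timestamp is dropped. The sketch below shows that pattern in isolation. It is a hedged illustration, not the component's code: `createSerialQueue`, `dropOlderThan` and `size` are hypothetical names, and the `try`/`catch` is an addition so that a rejected task cannot leave the queue stuck in the running state (in the component, `saveAudioChunk` only ever resolves, so this case does not arise there).

```javascript
// Hypothetical standalone version of the serial task queue pattern.
function createSerialQueue() {
  let tasks = [];
  let running = false;

  // Run queued tasks one after another until the queue is empty.
  const drain = async () => {
    running = true;
    while (tasks.length > 0) {
      const { func } = tasks.shift();
      try {
        await func();
      } catch (err) {
        console.error('Task failed:', err);
      }
    }
    running = false;
  };

  return {
    add(time, func) {
      tasks.push({ time, func });
      if (!running) drain();
    },
    // Mirrors the findIndex/slice trimming: if no task is recent enough,
    // the queue is left unchanged.
    dropOlderThan(cutoff) {
      const start = tasks.findIndex(task => task.time >= cutoff);
      if (start !== -1) tasks = tasks.slice(start);
    },
    size: () => tasks.length
  };
}

// Usage: the first task starts immediately, the others wait in the queue.
const queue = createSerialQueue();
queue.add(100, () => new Promise(resolve => setTimeout(resolve, 10)));
queue.add(200, async () => {});
queue.add(300, async () => {});
queue.add(400, async () => {});
queue.dropOlderThan(300); // drops the task stamped 200
const remaining = queue.size();
console.log(remaining);
```

Note that, as in the component, a cutoff newer than every queued task drops nothing, because `findIndex` returns `-1` in that case.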
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/VoiceCall_0105.vue ADDED
@@ -0,0 +1,829 @@
1
+ <template>
2
+ <ExtraInfo webVersion="websocket_0107" :modelVersion="modelVersion" />
3
+ <div class="voice-page">
4
+ <div class="voice-page-header">
5
+ <div class="header-icon">
6
+ <img src="@/assets/images/voice-icon.png" />
7
+ <span>Audio Choice</span>
8
+ </div>
9
+ <div class="voice-container" v-if="!isCalling">
10
+ <SvgIcon name="voice" class="voice-icon" />
11
+ <SvgIcon name="voice" class="voice-icon" />
12
+ <SvgIcon name="voice" class="voice-icon" />
13
+ </div>
14
+ <div class="voice-container" v-else>
15
+ <Voice
16
+ :dataArray="dataArray"
17
+ :isCalling="isCalling"
18
+ :isPlaying="playing"
19
+ :configList="videoConfigList"
20
+ :boxStyle="{ height: '45px' }"
21
+ :itemStyle="{ width: '3px', margin: '0 1px' }"
22
+ />
23
+ </div>
24
+ <!-- <SelectTimbre v-model:timbre="timbre" v-model:audioData="audioData" v-model:disabled="isCalling" /> -->
25
+ </div>
26
+ <div class="voice-page-output">
27
+ <div class="output-content">
28
+ <ModelOutput v-if="outputData.length > 0" :outputData="outputData" containerClass="output-content" />
29
+ </div>
30
+ <div class="skip-box">
31
+ <DelayTips
32
+ v-if="delayTimestamp > 200 || delayCount > 2"
33
+ :delayTimestamp="delayTimestamp"
34
+ :delayCount="delayCount"
35
+ />
36
+ <LikeAndDislike v-model:feedbackStatus="feedbackStatus" v-model:curResponseId="curResponseId" />
37
+ <SkipBtn :disabled="skipDisabled" @click="skipVoice" />
38
+ </div>
39
+ </div>
40
+ <div class="voice-page-btn">
41
+ <el-button v-show="!isCalling" type="success" :disabled="callDisabled" @click="initRecording">
42
+ {{ callDisabled ? 'Not ready yet, please wait' : 'Call MiniCPM' }}
43
+ </el-button>
44
+ <el-button v-show="isCalling" @click="stopRecording" type="danger">
45
+ <SvgIcon name="phone-icon" className="phone-icon" />
46
+ <span class="btn-text">Hang Up</span>
47
+ <CountDown v-model="isCalling" @timeUp="stopRecording" />
48
+ </el-button>
49
+ </div>
50
+ <IdeasList v-if="showIdeasList" :ideasList="voiceIdeasList" />
51
+ </div>
52
+ </template>
53
+ <script setup>
54
+ import { sendMessage, stopMessage, uploadConfig } from '@/apis';
55
+ import { encodeWAV } from '@/hooks/useVoice';
56
+ import { getNewUserId, setNewUserId } from '@/hooks/useRandomId';
57
+ import { fetchEventSource } from '@microsoft/fetch-event-source';
58
+ import { MicVAD } from '@ricky0123/vad-web';
59
+ import { videoConfigList, voiceConfigList, voiceIdeasList, showIdeasList } from '@/enums';
60
+ import { getChunkLength } from '@/utils';
61
+ import { mergeBase64ToBlob } from './merge';
62
+ import WebSocketService from '@/utils/websocket';
63
+
64
+ let ctrl = new AbortController();
65
+ let socket = null;
66
+ const audioData = ref({
67
+ base64Str: '',
68
+ type: 'mp3'
69
+ }); // Base64 of the custom voice timbre
70
+ const isCalling = defineModel();
71
+ const taskQueue = ref([]);
72
+ const running = ref(false);
73
+ const outputData = ref([]);
74
+ const textQueue = ref('');
75
+ const textAnimationInterval = ref();
76
+
77
+ const isFirstReturn = ref(true); // The first audio returned is the clip the frontend sent to the backend, so handle it separately
78
+
79
+ const audioPlayQueue = ref([]);
80
+ const base64List = ref([]);
81
+ const playing = ref(false);
82
+ const skipDisabled = ref(true);
83
+ const stop = ref(false);
84
+ const timbre = ref([1]);
85
+ const isReturnError = ref(false);
86
+ const allVoice = ref([]);
87
+ const callDisabled = ref(true);
88
+
89
+ const feedbackStatus = ref('');
90
+ const curResponseId = ref('');
91
+ const delayTimestamp = ref(0); // Latency of the chunk currently being sent
92
+ const delayCount = ref(0); // Number of queued chunks not yet sent to the API
93
+
94
+ const modelVersion = ref('');
95
+
96
+ let audioDOM = new Audio();
97
+
98
+ const isEnd = ref(false); // The SSE connection has closed, so the model is considered to have finished this response
99
+ // Stop recording when the page unmounts
100
+ onBeforeUnmount(() => {
101
+ stopRecording();
102
+ });
103
+ const vadStartTime = ref();
104
+ let myvad = null;
105
+ let vadTimer = null; // VAD timer: if speech stops before the timeout fires, treat it as a false VAD trigger and ignore it; otherwise treat it as real speech and skip the current reply
106
+ const vadStart = async () => {
107
+ myvad = await MicVAD.new({
108
+ onSpeechStart: () => {
109
+ console.log('Speech start detected');
110
+ if (!skipDisabled.value) {
111
+ vadTimer && clearTimeout(vadTimer);
112
+ vadTimer = setTimeout(() => {
113
+ console.log('Interrupt time: ', +new Date());
114
+ skipVoice();
115
+ }, 500);
116
+ }
117
+ },
118
+ onSpeechEnd: audio => {
119
+ vadTimer && clearTimeout(vadTimer);
120
+ // debugger;
121
+ // do something with `audio` (Float32Array of audio samples at sample rate 16000)...
122
+ }
123
+ });
124
+ console.log('vad: ', myvad);
125
+ myvad.start();
126
+ };
127
+ onMounted(async () => {
128
+ const { code, message } = await stopMessage();
129
+ if (code !== 0) {
130
+ ElMessage({
131
+ type: 'error',
132
+ message: message,
133
+ duration: 3000,
134
+ customClass: 'system-error'
135
+ });
136
+ return;
137
+ }
138
+ callDisabled.value = false;
139
+ });
140
+ const delay = ms => {
141
+ return new Promise(resolve => setTimeout(resolve, ms));
142
+ };
143
+ const initRecording = async () => {
144
+ uploadUserConfig()
145
+ .then(async () => {
146
+ // Generate a new uid for every call
147
+ setNewUserId();
148
+
149
+ outputData.value = [];
150
+ buildConnect();
151
+ isCalling.value = true;
152
+ await delay(100);
153
+ if (socket) {
154
+ socket.close();
155
+ }
156
+ socket = new WebSocketService(
157
+ `/ws/stream${window.location.search}&uid=${getNewUserId()}&service=minicpmo-server`
158
+ );
159
+ socket.connect();
160
+ // After the connection is established, wait briefly before sending data
161
+ startRecording();
162
+ if (localStorage.getItem('canStopByVoice') === 'true') {
163
+ vadStart();
164
+ }
165
+ })
166
+ .catch(() => {});
167
+ };
168
+ let audioContext;
169
+ const analyser = ref();
170
+ const dataArray = ref();
171
+ let mediaRecorder;
172
+ let audioChunks = [];
173
+ const animationFrameId = ref();
174
+
175
+ const isFirstPiece = ref(true);
176
+
177
+ const startRecording = async () => {
178
+ // Get the user's audio stream
179
+ const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
180
+
181
+ // Create the AudioContext and MediaStreamAudioSourceNode
182
+ audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 16000 });
183
+ const source = audioContext.createMediaStreamSource(stream);
184
+
185
+ analyser.value = audioContext.createAnalyser();
186
+ // Connect the audio source node to the analyser
187
+ source.connect(analyser.value);
188
+ // Analyser settings
189
+ analyser.value.fftSize = 256;
190
+ const bufferLength = analyser.value.frequencyBinCount;
191
+ dataArray.value = new Uint8Array(bufferLength);
192
+ // Start drawing the waveform
193
+ drawBars();
194
+
195
+ // Create a ScriptProcessorNode to capture audio data
196
+ const processor = audioContext.createScriptProcessor(256, 1, 1);
197
+
198
+ processor.onaudioprocess = event => {
199
+ if (!isCalling.value) return;
200
+ if (isReturnError.value) {
201
+ stopRecording();
202
+ return;
203
+ }
204
+ const data = event.inputBuffer.getChannelData(0);
205
+ audioChunks.push(new Float32Array(data));
206
+ // Check whether one second of data has been collected
207
+ const totalBufferLength = audioChunks.reduce((total, curr) => total + curr.length, 0);
208
+ const chunkLength = getChunkLength(audioContext.sampleRate);
209
+ if (totalBufferLength >= chunkLength) {
210
+ // Merge into a single array and trim it to one second
211
+ const mergedBuffer = mergeBuffers(audioChunks, totalBufferLength);
212
+ const oneSecondBuffer = mergedBuffer.slice(0, chunkLength);
213
+ // Save and encode as WAV
214
+ addQueue(+new Date(), () => saveAudioChunk(oneSecondBuffer, +new Date()));
215
+ // Keep the leftover samples for the next chunk
216
+ audioChunks = [mergedBuffer.slice(chunkLength)];
217
+ }
218
+ };
219
+
220
+ source.connect(processor);
221
+ processor.connect(audioContext.destination);
222
+ };
223
+ const stopRecording = () => {
224
+ isCalling.value = false;
225
+ if (animationFrameId.value) {
226
+ cancelAnimationFrame(animationFrameId.value);
227
+ }
228
+ if (audioContext && audioContext.state !== 'closed') {
229
+ audioContext.close();
230
+ }
231
+ ctrl.abort();
232
+ ctrl = new AbortController();
233
+ taskQueue.value = [];
234
+ audioPlayQueue.value = [];
235
+ base64List.value = [];
236
+ isReturnError.value = false;
237
+ skipDisabled.value = true;
238
+ playing.value = false;
239
+ audioDOM.pause();
240
+ stopMessage();
241
+ if (socket) {
242
+ socket.close();
243
+ }
244
+ if (
245
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
246
+ outputData.value[outputData.value.length - 1].audio === '' &&
247
+ allVoice.value.length > 0
248
+ ) {
249
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
250
+ }
251
+ myvad && myvad.destroy();
252
+ };
253
+ const getStopValue = () => {
254
+ return stop.value;
255
+ };
256
+ const getPlayingValue = () => {
257
+ return playing.value;
258
+ };
259
+ const getStopStatus = () => {
260
+ return localStorage.getItem('canStopByVoice') === 'true';
261
+ };
262
+ const saveAudioChunk = (buffer, timestamp) => {
263
+ return new Promise(resolve => {
264
+ if (!getStopStatus() && getPlayingValue()) {
265
+ resolve();
266
+ return;
267
+ }
268
+ const wavBlob = encodeWAV(buffer, audioContext.sampleRate);
269
+ let reader = new FileReader();
270
+ reader.readAsDataURL(wavBlob);
271
+
272
+ reader.onloadend = async function () {
273
+ let base64data = reader.result.split(',')[1];
274
+ if (!base64data) {
275
+ resolve();
276
+ return;
277
+ }
278
+ const obj = {
279
+ uid: getNewUserId(),
280
+ messages: [
281
+ {
282
+ role: 'user',
283
+ content: [
284
+ {
285
+ type: 'input_audio',
286
+ input_audio: {
287
+ data: base64data,
288
+ format: 'wav',
289
+ timestamp: String(timestamp)
290
+ }
291
+ }
292
+ ]
293
+ }
294
+ ]
295
+ };
296
+ socket.send(JSON.stringify(obj));
297
+ socket.on('message', data => {
298
+ console.log('message: ', data);
299
+ delayTimestamp.value = +new Date() - timestamp;
300
+ delayCount.value = taskQueue.value.length;
301
+ resolve();
302
+ });
303
+ // Send the Base64 audio data to the backend
304
+ // try {
305
+ // await sendMessage(obj);
306
+ // delayTimestamp.value = +new Date() - timestamp;
307
+ // delayCount.value = taskQueue.value.length;
308
+ // } catch (err) {}
309
+ // resolve();
310
+ };
311
+ });
312
+ };
313
+ const mergeBuffers = (buffers, length) => {
314
+ const result = new Float32Array(length);
315
+ let offset = 0;
316
+ for (let buffer of buffers) {
317
+ result.set(buffer, offset);
318
+ offset += buffer.length;
319
+ }
320
+ return result;
321
+ };
322
+ // Establish the connection
323
+ const buildConnect = async () => {
324
+ const obj = {
325
+ messages: [
326
+ {
327
+ role: 'user',
328
+ content: [{ type: 'none' }]
329
+ }
330
+ ],
331
+ stream: true
332
+ };
333
+ isEnd.value = false;
334
+ ctrl.abort();
335
+ ctrl = new AbortController();
336
+ const url = `/api/v1/completions${window.location.search}`;
337
+ fetchEventSource(url, {
338
+ method: 'POST',
339
+ headers: {
340
+ 'Content-Type': 'application/json',
341
+ service: 'minicpmo-server',
342
+ uid: getNewUserId()
343
+ },
344
+ body: JSON.stringify(obj),
345
+ signal: ctrl.signal,
346
+ openWhenHidden: true,
347
+ async onopen(response) {
348
+ console.log('onopen', response);
349
+ isFirstPiece.value = true;
350
+ isFirstReturn.value = true;
351
+ allVoice.value = [];
352
+ base64List.value = [];
353
+ if (response.status !== 200) {
354
+ ElMessage({
355
+ type: 'error',
356
+ message: 'At limit. Please try again soon.',
357
+ duration: 3000,
358
+ customClass: 'system-error'
359
+ });
360
+ isReturnError.value = true;
361
+ } else {
362
+ isReturnError.value = false;
363
+ // skipDisabled.value = false;
364
+ drawText();
365
+ }
366
+ },
367
+ onmessage(msg) {
368
+ const data = JSON.parse(msg.data);
369
+ if (data.response_id) {
370
+ curResponseId.value = data.response_id;
371
+ }
372
+ if (data.choices[0]?.text) {
373
+ textQueue.value += data.choices[0].text.replace('<end>', '');
374
+ console.warn('text return time -------------------------------', +new Date());
375
+ }
376
+ // The first message returned is the audio clip the frontend sent to the backend, so it is handled separately
377
+ if (isFirstReturn.value) {
378
+ console.log('first return');
379
+ isFirstReturn.value = false;
380
+ // If the backend returned no audio, reconnect
381
+ if (!data.choices[0].audio) {
382
+ buildConnect();
383
+ return;
384
+ }
385
+ outputData.value.push({
386
+ type: 'USER',
387
+ audio: `data:audio/wav;base64,${data.choices[0].audio}`
388
+ });
389
+ outputData.value.push({
390
+ type: 'BOT',
391
+ text: '',
392
+ audio: ''
393
+ });
394
+ return;
395
+ }
396
+ if (data.choices[0]?.audio) {
397
+ console.warn('audio return time -------------------------------', +new Date());
398
+ if (!getStopValue() && isCalling.value) {
399
+ skipDisabled.value = false;
400
+ base64List.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
401
+ addAudioQueue(() => truePlay(data.choices[0].audio));
402
+ }
403
+ allVoice.value.push(`data:audio/wav;base64,${data.choices[0].audio}`);
404
+ } else {
405
+ // Something went wrong, so reconnect immediately
406
+ buildConnect();
407
+ }
408
+ if (data.choices[0]?.text?.includes('<end>')) {
409
+ console.log('received end marker:', +new Date());
410
+ if (
411
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
412
+ outputData.value[outputData.value.length - 1].audio === '' &&
413
+ allVoice.value.length > 0
414
+ ) {
415
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
416
+ }
417
+ }
418
+ },
419
+ onclose() {
420
+ console.log('onclose', +new Date());
421
+ isEnd.value = true;
422
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
423
+ vadStartTime.value = +new Date();
424
+ if (audioPlayQueue.value.length === 0) {
425
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
426
+ console.log('taskQueue111111111: ', taskQueue.value, startIndex);
427
+ if (startIndex !== -1) {
428
+ taskQueue.value = taskQueue.value.slice(startIndex);
429
+ console.log('length after trimming:', taskQueue.value, vadStartTime.value);
430
+ }
431
+ buildConnect();
432
+ }
433
+ },
434
+ onerror(err) {
435
+ console.log('onerror', err);
436
+ ctrl.abort();
437
+ ctrl = new AbortController();
438
+ throw err;
439
+ }
440
+ });
441
+ };
442
+ const drawText = async () => {
443
+ if (textQueue.value.length > 0) {
444
+ outputData.value[outputData.value.length - 1].text += textQueue.value[0];
445
+ textQueue.value = textQueue.value.slice(1);
446
+ } else {
447
+ cancelAnimationFrame(textAnimationInterval.value);
448
+ }
449
+ textAnimationInterval.value = requestAnimationFrame(drawText);
450
+ };
451
+ // Queue the returned audio clips and play them one at a time
452
+ const addAudioQueue = async item => {
453
+ audioPlayQueue.value.push(item);
454
+ if (isFirstPiece.value) {
455
+ await delay(500);
456
+ isFirstPiece.value = false;
457
+ }
458
+ if (audioPlayQueue.value.length > 0 && !playing.value) {
459
+ playing.value = true;
460
+ playAudio();
461
+ }
462
+ };
463
+ // Drive the playback queue
464
+ const playAudio = () => {
465
+ console.log('remaining playback queue:', audioPlayQueue.value, +new Date());
466
+
467
+ if (!isEnd.value && base64List.value.length >= 2) {
468
+ const remainLen = base64List.value.length;
469
+ const blob = mergeBase64ToBlob(base64List.value);
470
+ audioDOM.src = blob;
471
+ audioDOM.play();
472
+ console.error('early merged playback start time: ', +new Date());
473
+ audioDOM.onended = () => {
474
+ console.error('early merged playback end time: ', +new Date());
475
+ base64List.value = base64List.value.slice(remainLen);
476
+ audioPlayQueue.value = audioPlayQueue.value.slice(remainLen);
477
+ playAudio();
478
+ };
479
+ return;
480
+ }
481
+ if (isEnd.value && base64List.value.length >= 2) {
482
+ const blob = mergeBase64ToBlob(base64List.value);
483
+ // let audio = new Audio();
484
+ audioDOM.src = blob;
485
+ audioDOM.play();
486
+ console.error('final merged playback start time: ', +new Date());
487
+ audioDOM.onended = () => {
488
+ console.error('merged playback end time: ', +new Date());
489
+ // URL.revokeObjectURL(url);
490
+ base64List.value = [];
491
+ audioPlayQueue.value = [];
492
+ playing.value = false;
493
+ skipDisabled.value = true;
494
+ if (isCalling.value && !isReturnError.value) {
495
+ // skipDisabled.value = true;
496
+ taskQueue.value = [];
497
+ // Before interrupting, record the interruption time or the VAD trigger event
498
+ // vadStartTime.value = +new Date();
499
+ // // After each completion, keep only the audio from the 1 s before the current moment
500
+ // console.log(
501
+ // 'length before trimming:',
502
+ // taskQueue.value.map(item => item.time)
503
+ // );
504
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
505
+ // if (startIndex !== -1) {
506
+ // taskQueue.value = taskQueue.value.slice(startIndex);
507
+ // console.log(
508
+ // 'length after trimming:',
509
+ // taskQueue.value.map(item => item.time),
510
+ // vadStartTime.value
511
+ // );
512
+ // }
513
+ buildConnect();
514
+ }
515
+ };
516
+ return;
517
+ }
518
+ base64List.value.shift();
519
+ const item = audioPlayQueue.value.shift();
520
+ if (item) {
521
+ item().finally(() => playAudio());
522
+ } else {
523
+ playing.value = false;
524
+ if (isEnd.value) {
525
+ console.warn('play done................');
526
+ skipDisabled.value = true;
527
+ }
528
+ // Once playback finishes, start the next connection if the call is still active and the API returned no error
529
+ if (isEnd.value && isCalling.value && !isReturnError.value) {
530
+ // skipDisabled.value = true;
531
+ taskQueue.value = [];
532
+ // Before interrupting, record the interruption time or the VAD trigger event
533
+ // vadStartTime.value = +new Date();
534
+ // // After each completion, keep only the audio from the 1 s before the current moment
535
+ // console.log(
536
+ // 'length before trimming:',
537
+ // taskQueue.value.map(item => item.time)
538
+ // );
539
+ // let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
540
+ // if (startIndex !== -1) {
541
+ // taskQueue.value = taskQueue.value.slice(startIndex);
542
+ // console.log(
543
+ // 'length after trimming:',
544
+ // taskQueue.value.map(item => item.time),
545
+ // vadStartTime.value
546
+ // );
547
+ // }
548
+ buildConnect();
549
+ }
550
+ }
551
+ };
552
+
553
+ // Play audio
554
+ const truePlay = async voice => {
555
+ return new Promise(resolve => {
556
+ audioDOM.src = 'data:audio/wav;base64,' + voice;
557
+ console.error('playback start time:', +new Date());
558
+ audioDOM
559
+ .play()
560
+ .then(() => {
561
+ // console.error('playback end time: ', +new Date());
562
+ })
563
+ .catch(error => {
564
+ resolve();
565
+ if (error.name === 'NotAllowedError' || error.name === 'SecurityError') {
566
+ console.error('User interaction required or permission issue:', error);
567
+ ElMessage.warning('Audio playback failed');
568
+ } else {
569
+ console.error('Error playing audio:', error);
570
+ }
571
+ });
572
+
573
+ audioDOM.onended = () => {
574
+ console.error('playback end time: ', +new Date());
575
+ // URL.revokeObjectURL(url);
576
+ resolve();
577
+ };
578
+ });
579
+ };
580
+ // Start processing the queue as soon as it holds at least one task
581
+ const addQueue = (time, item) => {
582
+ taskQueue.value.push({ func: item, time });
583
+ if (taskQueue.value.length > 0 && !running.value) {
584
+ running.value = true;
585
+ processQueue();
586
+ }
587
+ };
588
+ const processQueue = () => {
589
+ const item = taskQueue.value.shift();
590
+ if (item?.func) {
591
+ item.func().then(() => {
592
+ console.warn('shift!!!!!!!!!');
593
+ processQueue();
594
+ });
595
+ } else {
596
+ running.value = false;
597
+ }
598
+ };
599
+ const drawBars = () => {
600
+ // AnalyserNode.getByteFrequencyData() copies the current frequency data into the Uint8Array (unsigned byte array) passed to it.
601
+ analyser.value.getByteFrequencyData(dataArray.value);
602
+ animationFrameId.value = requestAnimationFrame(drawBars);
603
+ };
604
+ // Skip the current clip
605
+ const skipVoice = async () => {
606
+ // Before interrupting, record the interruption time or the VAD trigger event
607
+ vadStartTime.value = +new Date();
608
+ if (!skipDisabled.value) {
609
+ if (
610
+ outputData.value[outputData.value.length - 1]?.type === 'BOT' &&
611
+ outputData.value[outputData.value.length - 1].audio === ''
612
+ ) {
613
+ outputData.value[outputData.value.length - 1].audio = mergeBase64ToBlob(allVoice.value);
614
+ }
615
+ base64List.value = [];
616
+ audioPlayQueue.value = [];
617
+ // After skipping, keep only the audio clips from 1 s before the current time onward
618
+ console.log(
619
+ 'length before trimming:',
620
+ taskQueue.value.map(item => item.time)
621
+ );
622
+ let startIndex = taskQueue.value.findIndex(item => item.time >= vadStartTime.value - 1000);
623
+ if (startIndex !== -1) {
624
+ taskQueue.value = taskQueue.value.slice(startIndex);
625
+ console.log(
626
+ 'length after trimming:',
627
+ taskQueue.value.map(item => item.time),
628
+ vadStartTime.value
629
+ );
630
+ }
631
+ stop.value = true;
632
+ audioDOM.pause();
633
+ setTimeout(() => {
634
+ skipDisabled.value = true;
635
+ }, 300);
636
+ try {
637
+ playing.value = false;
638
+ await stopMessage();
639
+ stop.value = false;
640
+ // playing.value = false;
641
+ buildConnect();
642
+ // cancelAnimationFrame(animationFrameId.value);
643
+ } catch (err) {}
644
+ }
645
+ };
646
+ // Upload the current user configuration before each call
647
+ const uploadUserConfig = async () => {
648
+ if (!localStorage.getItem('configData')) {
649
+ return new Promise(resolve => resolve());
650
+ }
651
+ const {
652
+ videoQuality,
653
+ useAudioPrompt,
654
+ voiceClonePrompt,
655
+ assistantPrompt,
656
+ vadThreshold,
657
+ audioFormat,
658
+ base64Str
659
+ } = JSON.parse(localStorage.getItem('configData'));
660
+ const obj = {
661
+ messages: [
662
+ {
663
+ role: 'user',
664
+ content: [
665
+ {
666
+ type: 'input_audio',
667
+ input_audio: {
668
+ data: base64Str,
669
+ format: audioFormat
670
+ }
671
+ },
672
+ {
673
+ type: 'options',
674
+ options: {
675
+ hd_video: videoQuality,
676
+ use_audio_prompt: useAudioPrompt,
677
+ vad_threshold: vadThreshold,
678
+ voice_clone_prompt: voiceClonePrompt,
679
+ assistant_prompt: assistantPrompt
680
+ }
681
+ }
682
+ ]
683
+ }
684
+ ]
685
+ };
686
+ const { code, message, data } = await uploadConfig(obj);
687
+ modelVersion.value = data?.choices?.content || '';
688
+ return new Promise((resolve, reject) => {
689
+ if (code !== 0) {
690
+ ElMessage({
691
+ type: 'error',
692
+ message: message,
693
+ duration: 3000,
694
+ customClass: 'system-error'
695
+ });
696
+ reject();
697
+ } else {
698
+ resolve();
699
+ }
700
+ });
701
+ };
702
+ </script>
703
+ <style lang="less">
704
+ .voice-page {
705
+ flex: 1;
706
+ height: 100%;
707
+ display: flex;
708
+ flex-direction: column;
709
+ &-header {
710
+ display: flex;
711
+ align-items: center;
712
+ padding: 0 16px 16px;
713
+ box-shadow: 0 0.5px 0 0 #e0e0e0;
714
+ margin-bottom: 16px;
715
+ justify-content: space-between;
716
+ .header-icon {
717
+ display: flex;
718
+ align-items: center;
719
+ img {
720
+ width: 24px;
721
+ height: 24px;
722
+ margin-right: 8px;
723
+ }
724
+ span {
725
+ color: rgba(23, 23, 23, 0.9);
726
+ font-family: PingFang SC;
727
+ font-size: 16px;
728
+ font-style: normal;
729
+ font-weight: 500;
730
+ line-height: normal;
731
+ margin-right: 40px;
732
+ flex-shrink: 0;
733
+ }
734
+ }
735
+ .voice-container {
736
+ display: flex;
737
+ .voice-icon {
738
+ width: 191px;
739
+ height: 45px;
740
+ }
741
+ }
742
+ }
743
+ &-output {
744
+ flex: 1;
745
+ height: 0;
746
+ padding: 0 16px;
747
+ margin-bottom: 16px;
748
+ display: flex;
749
+ flex-direction: column;
750
+ .output-content {
751
+ flex: 1;
752
+ overflow: auto;
753
+ }
754
+ .skip-box {
755
+ display: flex;
756
+ align-items: center;
757
+ justify-content: flex-end;
758
+ margin-top: 16px;
759
+ }
760
+ }
761
+ &-btn {
762
+ text-align: center;
763
+ padding: 8px 0;
764
+ .el-button {
765
+ width: 284px;
766
+ height: 46px;
767
+ border-radius: 8px;
768
+ }
769
+ .el-button.el-button--success {
770
+ background: #647fff;
771
+ border-color: #647fff;
772
+ &:hover {
773
+ opacity: 0.8;
774
+ }
775
+ span {
776
+ color: #fff;
777
+ font-family: PingFang SC;
778
+ font-size: 16px;
779
+ font-style: normal;
780
+ font-weight: 500;
781
+ line-height: normal;
782
+ }
783
+ }
784
+ .el-button.el-button--success.is-disabled {
785
+ background: #f3f3f3;
786
+ border-color: #f3f3f3;
787
+ span {
788
+ color: #d1d1d1;
789
+ }
790
+ }
791
+ .el-button.el-button--danger {
792
+ border-color: #dc3545;
793
+ background-color: #dc3545;
794
+ color: #ffffff;
795
+ font-family: PingFang SC;
796
+ font-size: 16px;
797
+ font-style: normal;
798
+ font-weight: 500;
799
+ line-height: normal;
800
+ .phone-icon {
801
+ margin-right: 10px;
802
+ }
803
+ .btn-text {
804
+ margin-right: 10px;
805
+ }
806
+ .btn-desc {
807
+ margin-right: 16px;
808
+ }
809
+ .time {
810
+ display: flex;
811
+ align-items: center;
812
+ .time-minute,
813
+ .time-second {
814
+ width: 26px;
815
+ height: 26px;
816
+ display: flex;
817
+ justify-content: center;
818
+ align-items: center;
819
+ border-radius: 3.848px;
820
+ background: rgba(47, 47, 47, 0.5);
821
+ }
822
+ .time-colon {
823
+ margin: 0 3px;
824
+ }
825
+ }
826
+ }
827
+ }
828
+ }
829
+ </style>
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/audioBufferToMp3Base64.js ADDED
@@ -0,0 +1,36 @@
1
+ import lame from '@breezystack/lamejs';
2
+
3
+ export const audioBufferToMp3Base64 = audioBuffer => {
4
+ const mp3Encoder = new lame.Mp3Encoder(1, 16000, 128);
5
+ const sampleBlockSize = 1152;
6
+ const mp3Data = [];
7
+
8
+ for (let i = 0; i < audioBuffer.length; i += sampleBlockSize) {
9
+ const sampleChunk = audioBuffer.subarray(i, i + sampleBlockSize);
10
+ const mp3buf = mp3Encoder.encodeBuffer(sampleChunk);
11
+ if (mp3buf.length > 0) {
12
+ mp3Data.push(new Int8Array(mp3buf));
13
+ }
14
+ }
15
+
16
+ const mp3buf = mp3Encoder.flush();
17
+ if (mp3buf.length > 0) {
18
+ mp3Data.push(new Int8Array(mp3buf));
19
+ }
20
+
21
+ const mp3Blob = new Blob(mp3Data, { type: 'audio/mp3' });
22
+ const url = URL.createObjectURL(mp3Blob);
23
+ let dom = document.querySelector('#voice-box');
24
+ let audio = document.createElement('audio');
25
+ audio.controls = true;
26
+ audio.src = url;
27
+ dom.appendChild(audio);
28
+ return new Promise(resolve => {
29
+ const reader = new FileReader();
30
+ reader.onloadend = () => {
31
+ const base64String = reader.result.split(',')[1];
32
+ resolve(base64String);
33
+ };
34
+ reader.readAsDataURL(mp3Blob);
35
+ });
36
+ };
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/merge.js ADDED
@@ -0,0 +1,132 @@
1
+ // Convert Base64 to ArrayBuffer
2
+ const base64ToArrayBuffer = base64 => {
3
+ const binaryString = atob(base64.includes(',') ? base64.split(',')[1] : base64); // Remove data URI scheme if present
4
+ const len = binaryString.length;
5
+ const bytes = new Uint8Array(len);
6
+ for (let i = 0; i < len; i++) {
7
+ bytes[i] = binaryString.charCodeAt(i);
8
+ }
9
+ return bytes.buffer;
10
+ };
11
+
12
+ // Parse WAV header and get audio data section
13
+ const parseWav = buffer => {
14
+ const view = new DataView(buffer);
15
+ const format = view.getUint16(20, true);
16
+ const channels = view.getUint16(22, true);
17
+ const sampleRate = view.getUint32(24, true);
18
+ const bitsPerSample = view.getUint16(34, true);
19
+ const dataOffset = 44;
20
+ const dataSize = view.getUint32(40, true);
21
+ const audioData = new Uint8Array(buffer, dataOffset, dataSize);
22
+
23
+ return {
24
+ format,
25
+ channels,
26
+ sampleRate,
27
+ bitsPerSample,
28
+ audioData
29
+ };
30
+ };
31
+
32
+ // Create WAV header for combined audio data
33
+ const createWavHeader = (audioDataSize, sampleRate, channels, bitsPerSample) => {
34
+ const arrayBuffer = new ArrayBuffer(44);
35
+ const view = new DataView(arrayBuffer);
36
+
37
+ const writeString = (view, offset, string) => {
38
+ for (let i = 0; i < string.length; i++) {
39
+ view.setUint8(offset + i, string.charCodeAt(i));
40
+ }
41
+ };
42
+
43
+ writeString(view, 0, 'RIFF'); // ChunkID
44
+ view.setUint32(4, 36 + audioDataSize, true); // ChunkSize
45
+ writeString(view, 8, 'WAVE'); // Format
46
+ writeString(view, 12, 'fmt '); // Subchunk1ID
47
+ view.setUint32(16, 16, true); // Subchunk1Size (PCM)
48
+ view.setUint16(20, 1, true); // AudioFormat (PCM)
49
+ view.setUint16(22, channels, true); // NumChannels
50
+ view.setUint32(24, sampleRate, true); // SampleRate
51
+ view.setUint32(28, (sampleRate * channels * bitsPerSample) / 8, true); // ByteRate
52
+ view.setUint16(32, (channels * bitsPerSample) / 8, true); // BlockAlign
53
+ view.setUint16(34, bitsPerSample, true); // BitsPerSample
54
+ writeString(view, 36, 'data'); // Subchunk2ID
55
+ view.setUint32(40, audioDataSize, true); // Subchunk2Size
56
+
57
+ return arrayBuffer;
58
+ };
59
+
60
+ // Merge multiple Base64 audio files and return a Blob
61
+ const mergeAudioFiles = base64AudioArray => {
62
+ let sampleRate, channels, bitsPerSample;
63
+ let combinedAudioData = new Uint8Array();
64
+
65
+ for (let i = 0; i < base64AudioArray.length; i++) {
66
+ const arrayBuffer = base64ToArrayBuffer(base64AudioArray[i]);
67
+ const wav = parseWav(arrayBuffer);
68
+
69
+ // Initialize properties based on the first audio file
70
+ if (i === 0) {
71
+ sampleRate = wav.sampleRate;
72
+ channels = wav.channels;
73
+ bitsPerSample = wav.bitsPerSample;
74
+ }
75
+
76
+ // Ensure all files have the same format
77
+ if (wav.sampleRate !== sampleRate || wav.channels !== channels || wav.bitsPerSample !== bitsPerSample) {
78
+ throw new Error('All audio files must have the same format.');
79
+ }
80
+
81
+ // Combine audio data
82
+ const newCombinedData = new Uint8Array(combinedAudioData.byteLength + wav.audioData.byteLength);
83
+ newCombinedData.set(combinedAudioData, 0);
84
+ newCombinedData.set(wav.audioData, combinedAudioData.byteLength);
85
+ combinedAudioData = newCombinedData;
86
+ }
87
+
88
+ const combinedAudioDataSize = combinedAudioData.byteLength;
89
+ const wavHeader = createWavHeader(combinedAudioDataSize, sampleRate, channels, bitsPerSample);
90
+ const combinedWavBuffer = new Uint8Array(wavHeader.byteLength + combinedAudioData.byteLength);
91
+ combinedWavBuffer.set(new Uint8Array(wavHeader), 0);
92
+ combinedWavBuffer.set(combinedAudioData, wavHeader.byteLength);
93
+
94
+ // Create a Blob from the combined audio data
95
+ const combinedBlob = new Blob([combinedWavBuffer], { type: 'audio/wav' });
96
+ return combinedBlob;
97
+ };
98
+ export const mergeBase64ToBlob = base64List => {
99
+ const combinedBlob = mergeAudioFiles(base64List);
100
+ const audioUrl = URL.createObjectURL(combinedBlob);
101
+ return audioUrl;
102
+ };
103
+
104
+ // Assumes base64Strings is an array of Base64-encoded WAV files
105
+ // Note: these Base64 strings must not include a URI prefix (e.g. "audio/wav;base64,")
106
+ /**
107
+ *
108
+ * @param {Array} base64Strings
109
+ * @returns
110
+ */
111
+ // Decode the Base64 strings and merge the binary data
112
+ export const mergeBase64WavFiles = base64Strings => {
113
+ const binaryDataArray = base64Strings.map(base64 => {
114
+ return Uint8Array.from(atob(base64), c => c.charCodeAt(0));
115
+ });
116
+
117
+ const totalLength = binaryDataArray.reduce((sum, arr) => sum + arr.length, 0);
118
+
119
+ const mergedArray = new Uint8Array(totalLength);
120
+ let offset = 0;
121
+
122
+ binaryDataArray.forEach(arr => {
123
+ mergedArray.set(arr, offset);
124
+ offset += arr.length;
125
+ });
126
+
127
+ // Re-encode as a Base64 string
128
+ const binaryString = String.fromCharCode(...mergedArray);
129
+ const mergedBase64 = btoa(binaryString);
130
+
131
+ return mergedBase64;
132
+ };
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/components/mergeMp3Base64.js ADDED
@@ -0,0 +1,29 @@
1
+ const base64ToArrayBuffer = base64 => {
2
+ let binaryString = atob(base64);
3
+ let len = binaryString.length;
4
+ let bytes = new Uint8Array(len);
5
+ for (let i = 0; i < len; i++) {
6
+ bytes[i] = binaryString.charCodeAt(i);
7
+ }
8
+ return bytes.buffer;
9
+ };
10
+
11
+ const concatenateArrayBuffers = buffers => {
12
+ let totalLength = buffers.reduce((acc, value) => acc + value.byteLength, 0);
13
+ let result = new Uint8Array(totalLength);
14
+ let offset = 0;
15
+ for (let buffer of buffers) {
16
+ result.set(new Uint8Array(buffer), offset);
17
+ offset += buffer.byteLength;
18
+ }
19
+ return result.buffer;
20
+ };
21
+
22
+ export const mergeMp3Base64ToBlob = base64Strings => {
23
+ let arrayBuffers = base64Strings.map(base64ToArrayBuffer);
24
+ let combinedArrayBuffer = concatenateArrayBuffers(arrayBuffers);
25
+ const blob = new Blob([combinedArrayBuffer], { type: 'audio/mp3' });
26
+ const url = URL.createObjectURL(blob);
27
+ console.log('url', url);
28
+ return url;
29
+ };
r1-a/response_generation/minicpm/MiniCPM-o/web_demos/minicpm-o_2.6/web_server/src/views/home/index.vue ADDED
@@ -0,0 +1,262 @@
1
+ <template>
2
+ <div class="home-page">
3
+ <div class="home-page-header">
4
+ <div class="home-page-header-logo">
5
+ <!-- <img src="@/assets/images/logo.png" /> -->
6
+ <SvgIcon name="miniCPM2.6" class="logo-icon" />
7
+ </div>
8
+ <div class="home-page-header-menu">
9
+ <div
10
+ class="home-page-header-menu-item"
11
+ v-for="(item, index) in tabList"
12
+ :key="item.type"
13
+ :class="`home-page-header-menu-item ${activeTab === item.type ? 'active-tab' : ''} ${item.disabled ? 'disabled-tab' : ''}`"
14
+ @click="handleClickTab(item.type, index)"
15
+ >
16
+ {{ getMenuTab(item.type) }}
17
+ </div>
18
+ </div>
19
+
20
+ <div class="home-page-header-switch">
21
+ <div class="change-language">
22
+ <div
23
+ :class="`change-language-item ${language === 'en' ? 'active-language' : ''}`"
24
+ @click="handleChangeLanguage('en')"
25
+ >
26
+ English
27
+ </div>
28
+ <div
29
+ :class="`change-language-item ${language === 'zh' ? 'active-language' : ''}`"
30
+ @click="handleChangeLanguage('zh')"
31
+ >
32
+ 中文
33
+ </div>
34
+ </div>
35
+ </div>
36
+ </div>
37
+ <div :class="`home-page-content ${activeTab === 'chatbot' ? 'no-padding' : ''}`">
38
+ <VoiceCallWs v-if="isWebSocket && activeTab === 'voice'" v-model="isCalling" />
39
+ <VoiceCall v-else-if="!isWebSocket && activeTab === 'voice'" v-model="isCalling" />
40
+ <VideoCallWs v-else-if="isWebSocket && activeTab === 'video'" v-model="isCalling" />
41
+ <VideoCall v-else-if="!isWebSocket && activeTab === 'video'" v-model="isCalling" />
42
+ <!-- TODO: https is required to support chatbot in iframe -->
43
+ <iframe
44
+ src="http://127.0.0.1:8000/"
45
+ frameborder="0"
46
+ width="100%"
47
+ height="100%"
48
+ v-else
49
+ />
50
+ <div class="config-box" v-if="activeTab !== 'chatbot'">
51
+ <ModelConfig v-model:isCalling="isCalling" v-model:type="activeTab" />
52
+ </div>
53
+ </div>
54
+ </div>
55
+ </template>
56
+
57
+ <script setup>
58
+ import VoiceCall from './components/VoiceCall.vue';
59
+ import VoiceCallWs from './components/VoiceCall_0105.vue';
60
+ import VideoCall from './components/VideoCall.vue';
61
+ import VideoCallWs from './components/VideoCall_0105.vue';
62
+ import { useI18n } from 'vue-i18n';
63
+ import { useRoute, useRouter } from 'vue-router';
64
+
65
+ const route = useRoute();
66
+ const router = useRouter();
67
+ const typeObj = {
68
+ 0: 'video',
69
+ 1: 'voice',
70
+ 2: 'chatbot'
71
+ };
72
+ const defaultType = typeObj[route.query.type] || 'voice';
73
+
74
+ const { t, locale } = useI18n();
75
+ const activeTab = ref(defaultType);
76
+ const language = ref(localStorage.getItem('language') || 'zh');
77
+ const isWebSocket = false;
78
+ const tabList = ref([
79
+ {
80
+ type: 'video',
81
+ text: 'Realtime Video Call'
82
+ },
83
+ {
84
+ type: 'voice',
85
+ text: 'Realtime Voice Call'
86
+ },
87
+ {
88
+ type: 'chatbot',
89
+ text: 'Chatbot'
90
+ // disabled: true
91
+ }
92
+ ]);
93
+ const isCalling = ref(false);
94
+ const handleChangeLanguage = val => {
95
+ console.log('val: ', val);
96
+ language.value = val;
97
+ locale.value = val;
98
+ localStorage.setItem('language', val);
99
+ };
100
+ const getMenuTab = val => {
101
+ let text = '';
102
+ switch (val) {
103
+ case 'video':
104
+ text = t('menuTabVideo');
105
+ break;
106
+ case 'voice':
107
+ text = t('menuTabAudio');
108
+ break;
109
+ case 'chatbot':
110
+ text = t('menuTabChatbot');
111
+ break;
112
+ default:
113
+ break;
114
+ }
115
+ return text;
116
+ };
117
+ const handleClickTab = (val, index) => {
118
+ activeTab.value = val;
119
+ const port = route.query.port;
120
+ const type = index;
121
+ router.push({
122
+ path: '/',
123
+ query: {
124
+ port,
125
+ type
126
+ }
127
+ });
128
+ };
129
+ </script>
130
+
131
+ <style lang="less" scoped>
132
+ .home-page {
133
+ width: 100%;
134
+ height: 100%;
135
+ display: flex;
136
+ flex-direction: column;
137
+ &-header {
138
+ display: flex;
139
+ align-items: center;
140
+ &-logo {
141
+ width: 174px;
142
+ height: 46px;
143
+ display: flex;
144
+ align-items: center;
145
+ justify-content: center;
146
+ border-radius: 12px;
147
+ background: #ffffff;
148
+ flex-shrink: 0;
149
+ padding: 0 24px;
150
+ .logo-icon {
151
+ width: 100%;
152
+ height: 100%;
153
+ }
154
+ }
155
+ &-menu {
156
+ display: flex;
157
+ align-items: center;
158
+ margin-left: 16px;
159
+ &-item {
160
+ width: 260px;
161
+ height: 46px;
162
+ display: flex;
163
+ align-items: center;
164
+ justify-content: center;
165
+ background: #ffffff;
166
+ color: #252525;
167
+ font-family: PingFang SC;
168
+ font-size: 16px;
169
+ font-style: normal;
170
+ font-weight: 400;
171
+ line-height: normal;
172
+ border: 1px solid #dde1eb;
173
+ cursor: pointer;
174
+ user-select: none;
175
+ }
176
+ &-item + &-item {
177
+ border-left: none;
178
+ }
179
+ &-item:first-of-type {
180
+ border-radius: 12px 0 0 12px;
181
+ }
182
+ &-item:last-of-type {
183
+ border-radius: 0 12px 12px 0;
184
+ }
185
+ .active-tab {
186
+ color: #ffffff;
187
+ background: linear-gradient(90deg, #789efe 0.02%, #647fff 75.28%);
188
+ font-weight: 500;
189
+ }
190
+ .disabled-tab {
191
+ cursor: not-allowed;
192
+ border-color: #dde1eb;
193
+ color: #d1d1d1;
194
+ }
195
+ }
196
+ &-switch {
197
+ flex: 1;
198
+ display: flex;
199
+ align-items: center;
200
+ justify-content: flex-end;
201
+ .change-language {
202
+ display: flex;
203
+ align-items: center;
204
+ &-item {
205
+ width: 80px;
206
+ height: 32px;
207
+ display: flex;
208
+ justify-content: center;
209
+ align-items: center;
210
+ border: 1px solid #dde1eb;
211
+ background: #ffffff;
212
+ color: #252525;
213
+ font-family: PingFang SC;
214
+ font-size: 14px;
215
+ font-weight: 400;
216
+ line-height: normal;
217
+ cursor: pointer;
218
+ user-select: none;
219
+ }
220
+ &-item:first-of-type {
221
+ border-right: none;
222
+ border-radius: 12px 0 0 12px;
223
+ }
224
+ &-item:last-of-type {
225
+ border-radius: 0 12px 12px 0;
226
+ }
227
+ &-item.active-language {
228
+ color: #ffffff;
229
+ background: linear-gradient(90deg, #789efe 0.02%, #647fff 75.28%);
230
+ }
231
+ }
232
+ }
233
+ }
234
+ &-content {
235
+ flex: 1;
236
+ height: 0;
237
+ border-radius: 12px;
238
+ margin-top: 16px;
239
+ background: #ffffff;
240
+ padding: 18px;
241
+ display: flex;
242
+ .config-box {
243
+ width: 322px;
244
+ margin-left: 16px;
245
+ // border-left: 1px solid black;
246
+ box-shadow: -0.5px 0 0 0 #e0e0e0;
247
+ overflow: auto;
248
+ }
249
+ }
250
+ .no-padding {
251
+ padding: 0;
252
+ overflow: hidden;
253
+ background: #ffffff;
254
+ }
255
+ }
256
+ </style>
257
+ <style lang="less">
258
+ .el-popover.el-popper.config-popover {
259
+ padding: 18px;
260
+ border-radius: 12px;
261
+ }
262
+ </style>