cwchang committed on
Commit 8c3fde1 · 0 Parent(s)

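feat: initialize Whisper speech-to-text web application

- Integrate the faster-whisper model for speech transcription
- Flask backend API handles audio file upload and transcription
- Apple-style minimalist front-end interface (Bento Grid layout)
- OpenCC Simplified-to-Traditional conversion, supporting Traditional Chinese output
- Supports multiple audio formats: MP3, WAV, OGG, FLAC, M4A, WebM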

Files changed (7)
  1. .gitignore +39 -0
  2. README.md +200 -0
  3. app.py +139 -0
  4. requirements.txt +4 -0
  5. start.sh +29 -0
  6. templates/index.html +876 -0
  7. uploads/.gitkeep +0 -0
.gitignore ADDED
@@ -0,0 +1,39 @@
+ # Virtual environment
+ .venv/
+ venv/
+ env/
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ *.egg-info/
+ dist/
+ build/
+
+ # Uploads
+ uploads/*
+ !uploads/.gitkeep
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Logs
+ *.log
+
+ # Environment variables
+ .env
+ .env.local
+
+ # Models (downloaded automatically)
+ *.bin
+ *.pt
README.md ADDED
@@ -0,0 +1,200 @@
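+ # 🎙️ Whisper Speech-to-Text Service
+
+ A speech-to-text web application based on the OpenAI Whisper large-v3 turbo model. It uses the optimized faster-whisper implementation to provide fast, accurate speech recognition.
+
+ ## ✨ Features
+
+ - 🚀 Fast transcription: uses the optimized faster-whisper engine
+ - 🌍 Multilingual support: Chinese, English, Japanese, Korean, and many other languages
+ - 🎯 Automatic language detection: no need to specify the language manually
+ - 📊 Segmented display: provides timestamps and text segments
+ - 🎨 Clean interface: a modern web UI
+ - 📁 Drag-and-drop upload: supports uploading files by drag and drop
+
+ ## 📋 System Requirements
+
+ - Python 3.11+
+ - uv (Python package manager)
+ - At least 4GB of available memory
+ - Supported audio formats: MP3, WAV, OGG, FLAC, M4A, WebM
+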
+ ## 🚀 Quick Start
+
+ ### 1. Install dependencies
+
+ The virtual environment and dependencies are already set up. To reinstall:
+
+ ```bash
+ # Create the virtual environment
+ uv venv
+
+ # Install dependencies (includes OpenCC, which app.py imports)
+ uv pip install -r requirements.txt
+ ```
+
+ ### 2. Start the service
+
+ Use the provided startup script:
+
+ ```bash
+ ./start.sh
+ ```
+
+ Or start manually:
+
+ ```bash
+ # Activate the virtual environment
+ source .venv/bin/activate
+
+ # Start the application
+ python app.py
+ ```
+
+ ### 3. Open the application
+
+ Open in a browser:
+ ```
+ http://localhost:5000
+ ```
+
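+ ## 📖 Usage
+
+ 1. **Upload audio**
+    - Click the upload area to select a file
+    - Or drag and drop an audio file onto the upload area
+
+ 2. **Configure options**
+    - **Language**: select the audio language, or leave it blank for automatic detection
+    - **Beam size**: larger values give higher accuracy but slower transcription (5 is recommended)
+
+ 3. **Start transcription**
+    - Click the "Start transcription" button
+    - On first use, the model is downloaded automatically (about 1.7GB)
+    - Wait for the transcription to finish
+
+ 4. **View results**
+    - Full text: the complete transcription
+    - Segment details: text segments with timestamps
+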
+ ## 🔧 Configuration
+
+ ### Change the service port
+
+ Edit the last line of `app.py`:
+
+ ```python
+ app.run(debug=True, host='0.0.0.0', port=5000)  # change the port value
+ ```
+
+ ### Adjust model precision
+
+ Modify the `load_model()` function in `app.py`:
+
+ ```python
+ # Options: "float16", "int8", "int8_float16"
+ model = WhisperModel("dropbox-dash/faster-whisper-large-v3-turbo", compute_type="int8")
+ ```
+
+ - `float16`: highest precision, slower, higher memory usage
+ - `int8`: balances precision and speed (recommended)
+ - `int8_float16`: faster, slightly lower precision
+
+ ### File size limit
+
+ Edit the setting in `app.py`:
+
+ ```python
+ MAX_FILE_SIZE = 100 * 1024 * 1024  # 100MB, adjust as needed
+ ```
+
+ ## 📁 Project Structure
+
+ ```
+ faster_whisper/
+ ├── .venv/              # Virtual environment
+ ├── app.py              # Flask backend application
+ ├── templates/
+ │   └── index.html      # Front-end interface
+ ├── uploads/            # Temporary file storage
+ ├── start.sh            # Startup script
+ └── README.md           # Documentation
+ ```
+
+ ## 🔌 API Endpoints
+
+ ### POST /api/transcribe
+
+ Transcribe an audio file.
+
+ **Request parameters:**
+ - `audio` (file): the audio file
+ - `language` (string, optional): language code (e.g. "zh", "en")
+ - `beam_size` (int, optional): beam search size (default 5)
+
+ **Example response:**
+ ```json
+ {
+   "success": true,
+   "language": "zh",
+   "duration": 120.5,
+   "full_text": "完整的轉錄文字...",
+   "segments": [
+     {
+       "start": 0.0,
+       "end": 3.5,
+       "text": "第一段文字"
+     }
+   ]
+ }
+ ```
+
+ ### GET /api/health
+
+ Check the service status.
+
+ **Example response:**
+ ```json
+ {
+   "status": "ok",
+   "model_loaded": true
+ }
+ ```
+
+ ## ⚠️ Notes
+
+ 1. **First run**: the model (about 1.7GB) is downloaded automatically on the first transcription, so make sure the network connection is stable
+ 2. **Memory**: at least 4GB of available memory is required at runtime
+ 3. **File cleanup**: uploaded temporary files are deleted automatically after transcription
+ 4. **GPU acceleration**: if an NVIDIA GPU is available, faster-whisper uses CUDA acceleration automatically
+
+ ## 🐛 FAQ
+
+ ### Q: Model download fails?
+ A: Check the network connection, or download the model manually from Hugging Face and place it locally.
+
+ ### Q: Out of memory?
+ A: Use `compute_type="int8"` to reduce memory usage, or process shorter audio clips.
+
+ ### Q: Transcription is slow?
+ A:
+ - Lower the `beam_size` value
+ - Use `compute_type="int8"`
+ - If you have a GPU, make sure CUDA is installed correctly
+
+ ### Q: Which languages are supported?
+ A: Whisper supports 99 languages, including Chinese, English, Japanese, Korean, Spanish, French, and German.
+
+ ## 📚 References
+
+ - [Faster Whisper GitHub](https://github.com/systran/faster-whisper)
+ - [OpenAI Whisper](https://github.com/openai/whisper)
+ - [Model on Hugging Face](https://huggingface.co/dropbox-dash/faster-whisper-large-v3-turbo)
+ - [CTranslate2 Documentation](https://opennmt.net/CTranslate2/)
+
+ ## 📄 License
+
+ MIT License
+
+ ## 🤝 Contributing
+
+ Issues and pull requests are welcome!
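The `segments` array in the README's `/api/transcribe` response example lends itself to a timestamped listing. The following is a minimal sketch of a client-side helper, assuming the response has exactly the shape shown in the README; `format_timestamp` and `format_segments` are illustrative names and are not part of this commit.

```python
"""Format the `segments` array of a /api/transcribe response as timestamped lines.

Assumption: the response dict has the shape shown in the README example.
The helper names below are hypothetical and do not exist in app.py.
"""


def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS.ss (minutes are not wrapped into hours)."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


def format_segments(response: dict) -> list[str]:
    """Return one '[start - end] text' line per segment."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text']}"
        for s in response.get("segments", [])
    ]


# Sample data copied from the README's response example.
example_response = {
    "success": True,
    "language": "zh",
    "duration": 120.5,
    "full_text": "完整的轉錄文字...",
    "segments": [{"start": 0.0, "end": 3.5, "text": "第一段文字"}],
}

for line in format_segments(example_response):
    print(line)
```

The same helper would work on the JSON body returned by any HTTP client, since it only reads the `start`, `end`, and `text` keys.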
app.py ADDED
@@ -0,0 +1,139 @@
+ from flask import Flask, request, jsonify, render_template
+ from flask_cors import CORS
+ from faster_whisper import WhisperModel
+ from opencc import OpenCC
+ import os
+ import tempfile
+ from werkzeug.utils import secure_filename
+
+ app = Flask(__name__)
+ CORS(app)
+
+ # Configuration
+ UPLOAD_FOLDER = 'uploads'
+ ALLOWED_EXTENSIONS = {'mp3', 'wav', 'ogg', 'flac', 'm4a', 'webm'}
+ MAX_FILE_SIZE = 100 * 1024 * 1024  # 100MB
+ app.config['MAX_CONTENT_LENGTH'] = MAX_FILE_SIZE  # have Flask reject larger request bodies
+ os.makedirs(UPLOAD_FOLDER, exist_ok=True)
+
+ # Global model variable
+ model = None
+
+ # OpenCC converters (Simplified to Traditional Chinese, Taiwan)
+ cc_s2tw = OpenCC('s2tw')  # Simplified to Traditional (Taiwan)
+ cc_s2twp = OpenCC('s2twp')  # Simplified to Traditional (Taiwan), with common phrase conversion
+
+
+ def convert_to_traditional(text, use_phrases=True):
+     """Convert Simplified Chinese to Traditional Chinese (Taiwan)."""
+     if use_phrases:
+         return cc_s2twp.convert(text)
+     return cc_s2tw.convert(text)
+
+ def allowed_file(filename):
+     return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
+
+ def load_model():
+     """Load the model lazily, on first use."""
+     global model
+     if model is None:
+         print("正在加载 Whisper 模型...")
+         model = WhisperModel("dropbox-dash/faster-whisper-large-v3-turbo", compute_type="int8")
+         print("模型加载完成!")
+     return model
+
+ @app.route('/')
+ def index():
+     return render_template('index.html')
+
+ @app.route('/api/transcribe', methods=['POST'])
+ def transcribe():
+     try:
+         # Check that a file was uploaded
+         if 'audio' not in request.files:
+             return jsonify({'error': '沒有上傳檔案'}), 400
+
+         file = request.files['audio']
+
+         if file.filename == '':
+             return jsonify({'error': '沒有選擇檔案'}), 400
+
+         if not allowed_file(file.filename):
+             return jsonify({'error': f'不支援的檔案格式。支援的格式: {", ".join(ALLOWED_EXTENSIONS)}'}), 400
+
+         # Read request parameters
+         language = request.form.get('language', None)
+         beam_size = int(request.form.get('beam_size', 5))
+         to_traditional = request.form.get('to_traditional', 'true').lower() == 'true'
+
+         # Save the upload to a temporary file
+         filename = secure_filename(file.filename)
+         filepath = os.path.join(UPLOAD_FOLDER, filename)
+         file.save(filepath)
+
+         try:
+             # Load the model and transcribe
+             whisper_model = load_model()
+
+             # Use initial_prompt to steer the output toward Traditional Chinese
+             initial_prompt = "以下是普通話的轉錄內容。" if language == 'zh' else None
+
+             segments, info = whisper_model.transcribe(
+                 filepath,
+                 language=language if language else None,
+                 beam_size=beam_size,
+                 vad_filter=True,
+                 initial_prompt=initial_prompt
+             )
+
+             # Collect results
+             results = []
+             full_text = ""
+
+             for segment in segments:
+                 text = segment.text.strip()
+
+                 # Convert to Traditional Chinese if the audio is Chinese and conversion is enabled
+                 if to_traditional and info.language == 'zh':
+                     text = convert_to_traditional(text)
+
+                 segment_data = {
+                     'start': round(segment.start, 2),
+                     'end': round(segment.end, 2),
+                     'text': text
+                 }
+                 results.append(segment_data)
+                 full_text += text + " "
+
+             # Delete the temporary file
+             os.remove(filepath)
+
+             return jsonify({
+                 'success': True,
+                 'language': info.language,
+                 'duration': round(info.duration, 2),
+                 'full_text': full_text.strip(),
+                 'segments': results
+             })
+
+         except Exception:
+             # Clean up the temporary file, then re-raise with the original traceback
+             if os.path.exists(filepath):
+                 os.remove(filepath)
+             raise
+
+     except Exception as e:
+         return jsonify({'error': f'轉錄失敗: {str(e)}'}), 500
+
+ @app.route('/api/health', methods=['GET'])
+ def health():
+     return jsonify({'status': 'ok', 'model_loaded': model is not None})
+
+ if __name__ == '__main__':
+     print("=" * 50)
+     print("Whisper 語音轉文字服務")
+     print("=" * 50)
+     print("服務將在 http://localhost:5000 啟動")
+     print("首次轉錄時會自動下載模型,請耐心等待...")
+     print("=" * 50)
+     app.run(debug=True, host='0.0.0.0', port=5000)  # debug=True is for local development only
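`transcribe()` saves each upload under `secure_filename(file.filename)`, so two requests that upload files with the same name could end up sharing one path in `uploads/`. The sketch below shows one possible alternative, not the approach taken in this commit: generate the server-side name from a UUID, keep only the validated extension, and delete the file in a `finally` block. `unique_upload_path` and `process_upload` are hypothetical helpers, and `ALLOWED_EXTENSIONS` is copied from `app.py`.

```python
import os
import uuid

# Copied from app.py so the sketch is self-contained.
ALLOWED_EXTENSIONS = {'mp3', 'wav', 'ogg', 'flac', 'm4a', 'webm'}


def unique_upload_path(upload_folder: str, original_name: str) -> str:
    """Build a path that never reuses the client-supplied file name.

    Only the lower-cased extension is kept, and only if it is allowed.
    (Hypothetical helper, not part of app.py.)
    """
    ext = original_name.rsplit('.', 1)[-1].lower() if '.' in original_name else ''
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext!r}")
    return os.path.join(upload_folder, f"{uuid.uuid4().hex}.{ext}")


def process_upload(path: str, work):
    """Run work(path) and always delete the file afterwards, even on error."""
    try:
        return work(path)
    finally:
        if os.path.exists(path):
            os.remove(path)
```

In the route, `work` would be a function that calls `whisper_model.transcribe(path, ...)` and collects the segments. With `try`/`finally`, the duplicated `os.remove` calls in the success and error branches collapse into one.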
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ faster-whisper==1.2.1
+ flask==3.1.2
+ flask-cors==6.0.1
+ opencc-python-reimplemented==0.1.7
start.sh ADDED
@@ -0,0 +1,29 @@
+ #!/bin/bash
+
+ echo "========================================"
+ echo " Whisper 語音轉文字服務啟動腳本"
+ echo "========================================"
+ echo ""
+
+ # Activate the virtual environment
+ echo "正在激活虛擬環境..."
+ source .venv/bin/activate
+
+ # Check dependencies
+ if ! python -c "import faster_whisper" 2>/dev/null; then
+     echo "錯誤: faster_whisper 未安裝"
+     echo "請運行: uv pip install -r requirements.txt"
+     exit 1
+ fi
+
+ echo "虛擬環境已激活"
+ echo ""
+ echo "啟動 Flask 服務器..."
+ echo "服務地址: http://localhost:5000"
+ echo ""
+ echo "按 Ctrl+C 停止服務"
+ echo "========================================"
+ echo ""
+
+ # Start the application
+ python app.py
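`start.sh` only checks that `faster_whisper` can be imported, while `app.py` also imports Flask, flask-cors, Werkzeug, and OpenCC. A broader pre-flight check could look like the sketch below. It is an illustration rather than part of the commit: `REQUIRED_MODULES`, `missing_modules`, and `preflight` are assumed names, and the module list is read off the import statements in `app.py`.

```python
from importlib.util import find_spec

# Top-level third-party modules imported by app.py (assumed from its import lines).
REQUIRED_MODULES = ["faster_whisper", "flask", "flask_cors", "opencc", "werkzeug"]


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]


def preflight(names=REQUIRED_MODULES) -> int:
    """Print a report and return a shell-style exit code (0 means all found)."""
    missing = missing_modules(names)
    if missing:
        print("Missing modules: " + ", ".join(missing))
        print("Try: uv pip install -r requirements.txt")
        return 1
    print("All required modules are importable.")
    return 0
```

A script could call `raise SystemExit(preflight())` before starting the server. `find_spec` only locates the modules and does not import them, so the check stays fast even though `faster_whisper` is heavy to load.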
templates/index.html ADDED
@@ -0,0 +1,876 @@
1
+ <!DOCTYPE html>
2
+ <html lang="zh-TW">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Whisper 語音轉文字</title>
7
+ <style>
8
+ :root {
9
+ --primary: #0071e3;
10
+ --primary-hover: #0077ed;
11
+ --text-primary: #1d1d1f;
12
+ --text-secondary: #86868b;
13
+ --bg-primary: #ffffff;
14
+ --bg-secondary: #f5f5f7;
15
+ --bg-tertiary: #fbfbfd;
16
+ --border: #d2d2d7;
17
+ --border-light: #e8e8ed;
18
+ --success: #34c759;
19
+ --error: #ff3b30;
20
+ --radius-sm: 12px;
21
+ --radius-md: 16px;
22
+ --radius-lg: 24px;
23
+ }
24
+
25
+ * {
26
+ margin: 0;
27
+ padding: 0;
28
+ box-sizing: border-box;
29
+ }
30
+
31
+ body {
32
+ font-family: -apple-system, BlinkMacSystemFont, 'SF Pro Display', 'SF Pro Text', 'Helvetica Neue', Arial, sans-serif;
33
+ background: var(--bg-secondary);
34
+ min-height: 100vh;
35
+ color: var(--text-primary);
36
+ line-height: 1.5;
37
+ -webkit-font-smoothing: antialiased;
38
+ }
39
+
40
+ .page {
41
+ max-width: 1200px;
42
+ margin: 0 auto;
43
+ padding: 60px 24px;
44
+ }
45
+
46
+ .header {
47
+ text-align: center;
48
+ margin-bottom: 48px;
49
+ }
50
+
51
+ .header h1 {
52
+ font-size: 48px;
53
+ font-weight: 600;
54
+ letter-spacing: -0.02em;
55
+ margin-bottom: 12px;
56
+ }
57
+
58
+ .header p {
59
+ font-size: 19px;
60
+ color: var(--text-secondary);
61
+ font-weight: 400;
62
+ }
63
+
64
+ /* Bento Grid */
65
+ .bento-grid {
66
+ display: grid;
67
+ grid-template-columns: repeat(12, 1fr);
68
+ gap: 16px;
69
+ }
70
+
71
+ .bento-card {
72
+ background: var(--bg-primary);
73
+ border-radius: var(--radius-lg);
74
+ padding: 32px;
75
+ transition: transform 0.2s ease, box-shadow 0.2s ease;
76
+ }
77
+
78
+ .bento-card:hover {
79
+ transform: translateY(-2px);
80
+ box-shadow: 0 8px 30px rgba(0, 0, 0, 0.08);
81
+ }
82
+
83
+ /* Upload Card - Large */
84
+ .upload-card {
85
+ grid-column: span 8;
86
+ min-height: 320px;
87
+ display: flex;
88
+ flex-direction: column;
89
+ }
90
+
91
+ .upload-area {
92
+ flex: 1;
93
+ border: 2px dashed var(--border);
94
+ border-radius: var(--radius-md);
95
+ display: flex;
96
+ flex-direction: column;
97
+ align-items: center;
98
+ justify-content: center;
99
+ cursor: pointer;
100
+ transition: all 0.2s ease;
101
+ background: var(--bg-tertiary);
102
+ }
103
+
104
+ .upload-area:hover {
105
+ border-color: var(--primary);
106
+ background: rgba(0, 113, 227, 0.02);
107
+ }
108
+
109
+ .upload-area.dragover {
110
+ border-color: var(--primary);
111
+ background: rgba(0, 113, 227, 0.05);
112
+ }
113
+
114
+ .upload-area.has-file {
115
+ border-style: solid;
116
+ border-color: var(--success);
117
+ background: rgba(52, 199, 89, 0.05);
118
+ }
119
+
120
+ .upload-icon {
121
+ width: 48px;
122
+ height: 48px;
123
+ color: var(--text-secondary);
124
+ margin-bottom: 16px;
125
+ transition: color 0.2s ease;
126
+ }
127
+
128
+ .upload-area:hover .upload-icon {
129
+ color: var(--primary);
130
+ }
131
+
132
+ .upload-area.has-file .upload-icon {
133
+ color: var(--success);
134
+ }
135
+
136
+ .upload-title {
137
+ font-size: 17px;
138
+ font-weight: 500;
139
+ margin-bottom: 6px;
140
+ }
141
+
142
+ .upload-subtitle {
143
+ font-size: 14px;
144
+ color: var(--text-secondary);
145
+ }
146
+
147
+ .file-info {
148
+ display: none;
149
+ margin-top: 20px;
150
+ padding: 16px;
151
+ background: var(--bg-secondary);
152
+ border-radius: var(--radius-sm);
153
+ }
154
+
155
+ .file-info.active {
156
+ display: flex;
157
+ align-items: center;
158
+ gap: 12px;
159
+ }
160
+
161
+ .file-icon {
162
+ width: 20px;
163
+ height: 20px;
164
+ color: var(--primary);
165
+ }
166
+
167
+ .file-name {
168
+ font-size: 14px;
169
+ font-weight: 500;
170
+ flex: 1;
171
+ overflow: hidden;
172
+ text-overflow: ellipsis;
173
+ white-space: nowrap;
174
+ }
175
+
176
+ .file-remove {
177
+ width: 20px;
178
+ height: 20px;
179
+ color: var(--text-secondary);
180
+ cursor: pointer;
181
+ transition: color 0.2s ease;
182
+ }
183
+
184
+ .file-remove:hover {
185
+ color: var(--error);
186
+ }
187
+
188
+ #fileInput {
189
+ display: none;
190
+ }
191
+
192
+ /* Settings Card */
193
+ .settings-card {
194
+ grid-column: span 4;
195
+ display: flex;
196
+ flex-direction: column;
197
+ gap: 24px;
198
+ }
199
+
200
+ .card-label {
201
+ display: flex;
202
+ align-items: center;
203
+ gap: 8px;
204
+ font-size: 13px;
205
+ font-weight: 500;
206
+ color: var(--text-secondary);
207
+ text-transform: uppercase;
208
+ letter-spacing: 0.02em;
209
+ margin-bottom: 20px;
210
+ }
211
+
212
+ .card-label-icon {
213
+ width: 16px;
214
+ height: 16px;
215
+ }
216
+
217
+ .form-group {
218
+ display: flex;
219
+ flex-direction: column;
220
+ gap: 8px;
221
+ }
222
+
223
+ .form-group label {
224
+ font-size: 14px;
225
+ font-weight: 500;
226
+ color: var(--text-primary);
227
+ }
228
+
229
+ select {
230
+ appearance: none;
231
+ background: var(--bg-secondary);
232
+ border: 1px solid var(--border-light);
233
+ border-radius: var(--radius-sm);
234
+ padding: 12px 40px 12px 16px;
235
+ font-size: 15px;
236
+ font-family: inherit;
237
+ color: var(--text-primary);
238
+ cursor: pointer;
239
+ transition: border-color 0.2s ease;
240
+ background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='16' height='16' viewBox='0 0 24 24' fill='none' stroke='%2386868b' stroke-width='2' stroke-linecap='round' stroke-linejoin='round'%3E%3Cpath d='m6 9 6 6 6-6'/%3E%3C/svg%3E");
241
+ background-repeat: no-repeat;
242
+ background-position: right 12px center;
243
+ }
244
+
245
+ select:focus {
246
+ outline: none;
247
+ border-color: var(--primary);
248
+ }
249
+
250
+ select:hover {
251
+ border-color: var(--text-secondary);
252
+ }
253
+
254
+ /* Action Card */
255
+ .action-card {
256
+ grid-column: span 12;
257
+ display: flex;
258
+ align-items: center;
259
+ justify-content: space-between;
260
+ padding: 24px 32px;
261
+ }
262
+
263
+ .action-info {
264
+ display: flex;
265
+ align-items: center;
266
+ gap: 12px;
267
+ color: var(--text-secondary);
268
+ font-size: 14px;
269
+ }
270
+
271
+ .action-info-icon {
272
+ width: 20px;
273
+ height: 20px;
274
+ }
275
+
276
+ .btn {
277
+ display: inline-flex;
278
+ align-items: center;
279
+ gap: 8px;
280
+ background: var(--primary);
281
+ color: white;
282
+ padding: 14px 28px;
283
+ border: none;
284
+ border-radius: 980px;
285
+ font-size: 15px;
286
+ font-weight: 500;
287
+ font-family: inherit;
288
+ cursor: pointer;
289
+ transition: all 0.2s ease;
290
+ }
291
+
292
+ .btn:hover:not(:disabled) {
293
+ background: var(--primary-hover);
294
+ transform: scale(1.02);
295
+ }
296
+
297
+ .btn:disabled {
298
+ background: var(--border);
299
+ cursor: not-allowed;
300
+ }
301
+
302
+ .btn-icon {
303
+ width: 18px;
304
+ height: 18px;
305
+ }
306
+
307
+ /* Loading State */
308
+ .loading {
309
+ display: none;
310
+ grid-column: span 12;
311
+ text-align: center;
312
+ padding: 48px;
313
+ }
314
+
315
+ .loading.active {
316
+ display: block;
317
+ }
318
+
319
+ .spinner {
320
+ width: 32px;
321
+ height: 32px;
322
+ border: 3px solid var(--border-light);
323
+ border-top-color: var(--primary);
324
+ border-radius: 50%;
325
+ animation: spin 0.8s linear infinite;
326
+ margin: 0 auto 16px;
327
+ }
328
+
329
+ @keyframes spin {
330
+ to { transform: rotate(360deg); }
331
+ }
332
+
333
+ .loading-text {
334
+ font-size: 15px;
335
+ color: var(--text-secondary);
336
+ }
337
+
338
+ /* Error State */
339
+ .error-card {
340
+ display: none;
341
+ grid-column: span 12;
342
+ background: rgba(255, 59, 48, 0.08);
343
+ border: 1px solid rgba(255, 59, 48, 0.2);
344
+ padding: 20px 24px;
345
+ }
346
+
347
+ .error-card.active {
348
+ display: flex;
349
+ align-items: center;
350
+ gap: 12px;
351
+ }
352
+
353
+ .error-icon {
354
+ width: 20px;
355
+ height: 20px;
356
+ color: var(--error);
357
+ flex-shrink: 0;
358
+ }
359
+
360
+ .error-text {
361
+ font-size: 14px;
362
+ color: var(--error);
363
+ }
364
+
365
+ /* Result Section */
366
+ .result-section {
367
+ display: none;
368
+ margin-top: 32px;
369
+ }
370
+
371
+ .result-section.active {
372
+ display: block;
373
+ }
374
+
375
+ .result-header {
376
+ display: flex;
377
+ align-items: center;
378
+ justify-content: space-between;
379
+ margin-bottom: 16px;
380
+ }
381
+
382
+ .result-title {
383
+ font-size: 24px;
384
+ font-weight: 600;
385
+ }
386
+
387
+ .result-meta {
388
+ display: flex;
389
+ gap: 24px;
390
+ }
391
+
392
+ .meta-item {
393
+ display: flex;
394
+ align-items: center;
395
+ gap: 6px;
396
+ font-size: 14px;
397
+ color: var(--text-secondary);
398
+ }
399
+
400
+ .meta-icon {
401
+ width: 16px;
402
+ height: 16px;
403
+ }
404
+
405
+ .result-grid {
406
+ display: grid;
407
+ grid-template-columns: repeat(12, 1fr);
408
+ gap: 16px;
409
+ }
410
+
411
+ /* Transcript Card */
412
+ .transcript-card {
413
+ grid-column: span 8;
414
+ }
415
+
416
+ .transcript-content {
417
+ font-size: 17px;
418
+ line-height: 1.7;
419
+ color: var(--text-primary);
420
+ white-space: pre-wrap;
421
+ }
422
+
423
+ /* Segments Card */
424
+ .segments-card {
425
+ grid-column: span 4;
426
+ max-height: 480px;
427
+ overflow-y: auto;
428
+ }
429
+
430
+ .segments-list {
431
+ display: flex;
432
+ flex-direction: column;
433
+ gap: 12px;
434
+ }
435
+
436
+ .segment-item {
437
+ padding: 16px;
438
+ background: var(--bg-secondary);
439
+ border-radius: var(--radius-sm);
440
+ transition: background 0.2s ease;
441
+ }
442
+
443
+ .segment-item:hover {
444
+ background: var(--bg-tertiary);
445
+ }
446
+
447
+ .segment-time {
448
+ display: inline-flex;
449
+ align-items: center;
450
+ gap: 4px;
451
+ font-size: 12px;
452
+ font-weight: 500;
453
+ color: var(--primary);
454
+ margin-bottom: 6px;
455
+ font-variant-numeric: tabular-nums;
456
+ }
457
+
458
+ .segment-time-icon {
459
+ width: 12px;
460
+ height: 12px;
461
+ }
462
+
463
+ .segment-text {
464
+ font-size: 14px;
465
+ color: var(--text-primary);
466
+ line-height: 1.5;
467
+ }
468
+
469
+ /* Scrollbar */
470
+ .segments-card::-webkit-scrollbar {
471
+ width: 6px;
472
+ }
473
+
474
+ .segments-card::-webkit-scrollbar-track {
475
+ background: transparent;
476
+ }
477
+
478
+ .segments-card::-webkit-scrollbar-thumb {
479
+ background: var(--border);
480
+ border-radius: 3px;
481
+ }
482
+
483
+ /* Responsive */
484
+ @media (max-width: 968px) {
485
+ .upload-card,
486
+ .settings-card,
487
+ .transcript-card,
488
+ .segments-card {
489
+ grid-column: span 12;
490
+ }
491
+
492
+ .header h1 {
493
+ font-size: 36px;
494
+ }
495
+
496
+ .segments-card {
497
+ max-height: none;
498
+ }
499
+ }
500
+
501
+ @media (max-width: 600px) {
502
+ .page {
503
+ padding: 32px 16px;
504
+ }
505
+
506
+ .header h1 {
507
+ font-size: 28px;
508
+ }
509
+
510
+ .header p {
511
+ font-size: 16px;
512
+ }
513
+
514
+ .bento-card {
515
+ padding: 24px;
516
+ }
517
+
518
+ .action-card {
519
+ flex-direction: column;
520
+ gap: 16px;
521
+ }
522
+
523
+ .action-info {
524
+ text-align: center;
525
+ }
526
+
527
+ .btn {
528
+ width: 100%;
529
+ justify-content: center;
530
+ }
531
+ }
532
+ </style>
533
+ </head>
534
+ <body>
535
+ <div class="page">
536
+ <header class="header">
537
+ <h1>語音轉文字</h1>
538
+ <p>上傳音訊檔案,取得精準的文字轉錄</p>
539
+ </header>
540
+
541
+ <div class="bento-grid">
542
+ <!-- Upload Card -->
543
+ <div class="bento-card upload-card">
544
+ <div class="upload-area" id="uploadArea">
545
+ <svg class="upload-icon" id="uploadIcon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round">
546
+ <path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/>
547
+ <polyline points="17 8 12 3 7 8"/>
548
+ <line x1="12" y1="3" x2="12" y2="15"/>
549
+ </svg>
550
+ <p class="upload-title" id="uploadTitle">點按或拖放檔案至此處</p>
551
+ <p class="upload-subtitle">支援 MP3、WAV、OGG、FLAC、M4A、WebM</p>
552
+ <input type="file" id="fileInput" accept="audio/*">
553
+ </div>
554
+ <div class="file-info" id="fileInfo">
555
+ <svg class="file-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
556
+ <path d="M9 18V5l12-2v13"/>
557
+ <circle cx="6" cy="18" r="3"/>
558
+ <circle cx="18" cy="16" r="3"/>
559
+ </svg>
560
+ <span class="file-name" id="fileName"></span>
561
+ <svg class="file-remove" id="fileRemove" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
562
+ <line x1="18" y1="6" x2="6" y2="18"/>
563
+ <line x1="6" y1="6" x2="18" y2="18"/>
564
+ </svg>
565
+ </div>
566
+ </div>
567
+
568
+ <!-- Settings Card -->
569
+ <div class="bento-card settings-card">
570
+ <div class="card-label">
571
+ <svg class="card-label-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
572
+ <path d="M12.22 2h-.44a2 2 0 0 0-2 2v.18a2 2 0 0 1-1 1.73l-.43.25a2 2 0 0 1-2 0l-.15-.08a2 2 0 0 0-2.73.73l-.22.38a2 2 0 0 0 .73 2.73l.15.1a2 2 0 0 1 1 1.72v.51a2 2 0 0 1-1 1.74l-.15.09a2 2 0 0 0-.73 2.73l.22.38a2 2 0 0 0 2.73.73l.15-.08a2 2 0 0 1 2 0l.43.25a2 2 0 0 1 1 1.73V20a2 2 0 0 0 2 2h.44a2 2 0 0 0 2-2v-.18a2 2 0 0 1 1-1.73l.43-.25a2 2 0 0 1 2 0l.15.08a2 2 0 0 0 2.73-.73l.22-.39a2 2 0 0 0-.73-2.73l-.15-.08a2 2 0 0 1-1-1.74v-.5a2 2 0 0 1 1-1.74l.15-.09a2 2 0 0 0 .73-2.73l-.22-.38a2 2 0 0 0-2.73-.73l-.15.08a2 2 0 0 1-2 0l-.43-.25a2 2 0 0 1-1-1.73V4a2 2 0 0 0-2-2z"/>
573
+ <circle cx="12" cy="12" r="3"/>
574
+ </svg>
575
+ 設定
576
+ </div>
577
+ <div class="form-group">
578
+ <label for="language">語言</label>
579
+ <select id="language">
580
+ <option value="">自動偵測</option>
581
+ <option value="zh">中文</option>
582
+ <option value="en">English</option>
583
+ <option value="ja">日本語</option>
584
+ <option value="ko">한국어</option>
585
+ <option value="es">Espanol</option>
586
+ <option value="fr">Francais</option>
587
+ <option value="de">Deutsch</option>
588
+ </select>
589
+ </div>
590
+ <div class="form-group">
591
+ <label for="beamSize">精確度</label>
592
+ <select id="beamSize">
593
+ <option value="1">快速</option>
594
+ <option value="3">標準</option>
595
+ <option value="5" selected>平衡</option>
596
+ <option value="8">精確</option>
597
+ <option value="10">最高</option>
598
+ </select>
599
+ </div>
600
+ <div class="form-group">
601
+ <label for="toTraditional">中文輸出</label>
602
+ <select id="toTraditional">
603
+ <option value="true" selected>繁體中文</option>
604
+ <option value="false">保持原樣</option>
605
+ </select>
606
+ </div>
607
+ </div>
608
+
609
+ <!-- Action Card -->
610
+ <div class="bento-card action-card">
611
+ <div class="action-info">
612
+ <svg class="action-info-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
613
+ <circle cx="12" cy="12" r="10"/>
614
+ <path d="M12 16v-4"/>
615
+ <path d="M12 8h.01"/>
616
+ </svg>
617
+ <span id="actionInfoText">選擇檔案後即可開始轉錄</span>
618
+ </div>
619
+ <button class="btn" id="transcribeBtn" disabled>
620
+ <svg class="btn-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
621
+ <polygon points="5 3 19 12 5 21 5 3"/>
622
+ </svg>
623
+ 開始轉錄
624
+ </button>
625
+ </div>
626
+
627
+ <!-- Loading State -->
628
+ <div class="bento-card loading" id="loading">
629
+ <div class="spinner"></div>
630
+ <p class="loading-text">正在處理音訊,請稍候</p>
631
+ </div>
632
+
633
+ <!-- Error State -->
634
+ <div class="bento-card error-card" id="error">
635
+ <svg class="error-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
636
+ <circle cx="12" cy="12" r="10"/>
637
+ <line x1="12" y1="8" x2="12" y2="12"/>
638
+ <line x1="12" y1="16" x2="12.01" y2="16"/>
639
+ </svg>
640
+ <span class="error-text" id="errorText"></span>
641
+ </div>
642
+ </div>
643
+
644
+ <!-- Result Section -->
645
+ <section class="result-section" id="result">
646
+ <div class="result-header">
647
+ <h2 class="result-title">轉錄結果</h2>
648
+ <div class="result-meta">
649
+ <div class="meta-item">
650
+ <svg class="meta-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
651
+ <circle cx="12" cy="12" r="10"/>
652
+ <path d="M2 12h20"/>
653
+ <path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/>
654
+ </svg>
655
+ <span id="detectedLanguage">-</span>
656
+ </div>
657
+ <div class="meta-item">
658
+ <svg class="meta-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
659
+ <circle cx="12" cy="12" r="10"/>
660
+ <polyline points="12 6 12 12 16 14"/>
661
+ </svg>
662
+ <span id="duration">-</span>
663
+ </div>
664
+ </div>
665
+ </div>
666
+
667
+ <div class="result-grid">
668
+ <div class="bento-card transcript-card">
669
+ <div class="card-label">
670
+ <svg class="card-label-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
671
+ <path d="M14.5 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V7.5L14.5 2z"/>
672
+ <polyline points="14 2 14 8 20 8"/>
673
+ <line x1="16" y1="13" x2="8" y2="13"/>
674
+ <line x1="16" y1="17" x2="8" y2="17"/>
675
+ <line x1="10" y1="9" x2="8" y2="9"/>
676
+ </svg>
677
+ 完整內容
678
+ </div>
679
+ <div class="transcript-content" id="fullText"></div>
680
+ </div>
681
+
682
+ <div class="bento-card segments-card">
683
+ <div class="card-label">
684
+ <svg class="card-label-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
685
+ <line x1="8" y1="6" x2="21" y2="6"/>
686
+ <line x1="8" y1="12" x2="21" y2="12"/>
687
+ <line x1="8" y1="18" x2="21" y2="18"/>
688
+ <line x1="3" y1="6" x2="3.01" y2="6"/>
689
+ <line x1="3" y1="12" x2="3.01" y2="12"/>
690
+ <line x1="3" y1="18" x2="3.01" y2="18"/>
691
+ </svg>
692
+ 時間軸
693
+ </div>
694
+ <div class="segments-list" id="segments"></div>
695
+ </div>
696
+ </div>
697
+ </section>
698
+ </div>
699
+
700
+ <script>
701
+ const uploadArea = document.getElementById('uploadArea');
702
+ const uploadIcon = document.getElementById('uploadIcon');
703
+ const uploadTitle = document.getElementById('uploadTitle');
704
+ const fileInput = document.getElementById('fileInput');
705
+ const fileInfo = document.getElementById('fileInfo');
706
+ const fileName = document.getElementById('fileName');
707
+ const fileRemove = document.getElementById('fileRemove');
708
+ const transcribeBtn = document.getElementById('transcribeBtn');
709
+ const actionInfoText = document.getElementById('actionInfoText');
710
+ const loading = document.getElementById('loading');
711
+ const result = document.getElementById('result');
712
+ const error = document.getElementById('error');
713
+ const errorText = document.getElementById('errorText');
714
+
715
+ let selectedFile = null;
716
+
+ // Upload area click
+ uploadArea.addEventListener('click', () => fileInput.click());
+
+ // File selection
+ fileInput.addEventListener('change', (e) => {
+     handleFile(e.target.files[0]);
+ });
+
+ // Drag and drop
+ uploadArea.addEventListener('dragover', (e) => {
+     e.preventDefault();
+     uploadArea.classList.add('dragover');
+ });
+
+ uploadArea.addEventListener('dragleave', () => {
+     uploadArea.classList.remove('dragover');
+ });
+
+ uploadArea.addEventListener('drop', (e) => {
+     e.preventDefault();
+     uploadArea.classList.remove('dragover');
+     handleFile(e.dataTransfer.files[0]);
+ });
+
+ // Remove file (stopPropagation keeps the click from reopening the file picker)
+ fileRemove.addEventListener('click', (e) => {
+     e.stopPropagation();
+     clearFile();
+ });
+
+ function handleFile(file) {
+     if (!file) return;
+
+     selectedFile = file;
+     fileName.textContent = file.name;
+     fileInfo.classList.add('active');
+     uploadArea.classList.add('has-file');
+     uploadTitle.textContent = '檔案已選取';
+
+     // Update upload icon to checkmark
+     uploadIcon.innerHTML = '<path d="M22 11.08V12a10 10 0 1 1-5.93-9.14"/><polyline points="22 4 12 14.01 9 11.01"/>';
+
+     transcribeBtn.disabled = false;
+     actionInfoText.textContent = '準備就緒,可開始轉錄';
+
+     result.classList.remove('active');
+     error.classList.remove('active');
+ }
+
+ function clearFile() {
+     selectedFile = null;
+     fileInput.value = '';
+     fileName.textContent = '';
+     fileInfo.classList.remove('active');
+     uploadArea.classList.remove('has-file');
+     uploadTitle.textContent = '點按或拖放檔案至此處';
+
+     // Reset upload icon
+     uploadIcon.innerHTML = '<path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="17 8 12 3 7 8"/><line x1="12" y1="3" x2="12" y2="15"/>';
+
+     transcribeBtn.disabled = true;
+     actionInfoText.textContent = '選擇檔案後即可開始轉錄';
+ }
+
+ // Transcribe button
+ transcribeBtn.addEventListener('click', async () => {
+     if (!selectedFile) return;
+
+     const formData = new FormData();
+     formData.append('audio', selectedFile);
+     formData.append('language', document.getElementById('language').value);
+     formData.append('beam_size', document.getElementById('beamSize').value);
+     formData.append('to_traditional', document.getElementById('toTraditional').value);
+
+     transcribeBtn.disabled = true;
+     loading.classList.add('active');
+     result.classList.remove('active');
+     error.classList.remove('active');
+
+     try {
+         const response = await fetch('/api/transcribe', {
+             method: 'POST',
+             body: formData
+         });
+
+         // A non-JSON body (for example, an HTML error page) counts as a failed
+         // transcription rather than being reported as a network error
+         const data = await response.json().catch(() => null);
+
+         if (data && data.success) {
+             displayResult(data);
+         } else {
+             showError((data && data.error) || `轉錄失敗(HTTP ${response.status})`);
+         }
+     } catch (err) {
+         showError('網路錯誤:' + err.message);
+     } finally {
+         loading.classList.remove('active');
+         transcribeBtn.disabled = false;
+     }
+ });
+
+ function displayResult(data) {
+     const langMap = {
+         'zh': '中文',
+         'en': 'English',
+         'ja': '日本語',
+         'ko': '한국어',
+         'es': 'Español',
+         'fr': 'Français',
+         'de': 'Deutsch'
+     };
+
+     // Fall back to the raw language code; guard against a missing field
+     document.getElementById('detectedLanguage').textContent = langMap[data.language] || String(data.language || '').toUpperCase();
+     document.getElementById('duration').textContent = formatDuration(data.duration);
+     document.getElementById('fullText').textContent = data.full_text;
+
+     const segmentsDiv = document.getElementById('segments');
+     segmentsDiv.innerHTML = '';
+
+     data.segments.forEach(segment => {
+         const segmentDiv = document.createElement('div');
+         segmentDiv.className = 'segment-item';
+         // Only static markup and formatted timestamps go through innerHTML
+         segmentDiv.innerHTML = `
+             <div class="segment-time">
+                 <svg class="segment-time-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+                     <circle cx="12" cy="12" r="10"/>
+                     <polyline points="12 6 12 12 16 14"/>
+                 </svg>
+                 ${formatTime(segment.start)} - ${formatTime(segment.end)}
+             </div>
+             <div class="segment-text"></div>
+         `;
+         // Transcript text is set via textContent so it is never parsed as HTML
+         segmentDiv.querySelector('.segment-text').textContent = segment.text;
+         segmentsDiv.appendChild(segmentDiv);
+     });
+
+     result.classList.add('active');
+     actionInfoText.textContent = '轉錄完成';
+ }
+
+ // Format a position in seconds as m:ss
+ function formatTime(seconds) {
+     const mins = Math.floor(seconds / 60);
+     const secs = Math.floor(seconds % 60);
+     return `${mins}:${secs.toString().padStart(2, '0')}`;
+ }
+
+ // Round once before splitting so values such as 59.6 or 119.6 never show "60 秒"
+ function formatDuration(seconds) {
+     const total = Math.round(seconds);
+     if (total < 60) {
+         return `${total} 秒`;
+     }
+     const mins = Math.floor(total / 60);
+     const secs = total % 60;
+     return `${mins} 分 ${secs} 秒`;
+ }
+
+ function showError(message) {
+     errorText.textContent = message;
+     error.classList.add('active');
+ }
+ </script>
+ </body>
+ </html>
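The two time-formatting helpers in the script above are pure functions, so they can be exercised outside the browser. The following is a minimal sketch rather than the commit's code: `formatClock` and `formatSpokenDuration` are hypothetical names, and both the hour field and the clamping of invalid input to zero are assumptions added for illustration.

```javascript
// Hypothetical helpers; names, hour field, and input clamping are illustrative assumptions.

// Format a position in seconds as m:ss, or h:mm:ss once it reaches one hour.
function formatClock(seconds) {
  // Non-numeric or negative input is clamped to 0 instead of rendering "NaN:NaN".
  const total = Math.max(0, Math.floor(Number(seconds) || 0));
  const hrs = Math.floor(total / 3600);
  const mins = Math.floor((total % 3600) / 60);
  const secs = total % 60;
  const ss = String(secs).padStart(2, '0');
  if (hrs > 0) {
    return `${hrs}:${String(mins).padStart(2, '0')}:${ss}`;
  }
  return `${mins}:${ss}`;
}

// Round to whole seconds once, then split into minutes and seconds,
// so the seconds part can never be displayed as 60.
function formatSpokenDuration(seconds) {
  const total = Math.max(0, Math.round(Number(seconds) || 0));
  if (total < 60) {
    return `${total} 秒`;
  }
  return `${Math.floor(total / 60)} 分 ${total % 60} 秒`;
}
```

Rounding before splitting is the design choice worth noting: splitting first and rounding the remainder separately is what allows a seconds value of 60 to appear.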
uploads/.gitkeep ADDED
File without changes
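For the `fetch('/api/transcribe')` call in `templates/index.html`, one way to keep response interpretation separate from the DOM code is a small pure function. This is a sketch under stated assumptions: `interpretTranscribeResponse` is a hypothetical name, and it assumes only what the page script already relies on, namely a JSON body with a `success` flag and an optional `error` string.

```javascript
// Hypothetical helper: turns an HTTP status and raw body text into a uniform outcome.
// Assumes the backend replies with JSON shaped like { success, error?, ... }.
function interpretTranscribeResponse(status, bodyText) {
  let data = null;
  try {
    data = JSON.parse(bodyText);
  } catch (e) {
    // The body was not JSON (for example, an HTML error page); data stays null.
  }

  const isObject = data !== null && typeof data === 'object';
  if (isObject && data.success) {
    return { ok: true, data };
  }

  // Prefer the server-provided message; otherwise fall back to the status code.
  const message = (isObject && data.error) || `轉錄失敗 (HTTP ${status})`;
  return { ok: false, message };
}
```

In the click handler this could be used as `const outcome = interpretTranscribeResponse(response.status, await response.text());`, followed by `displayResult(outcome.data)` or `showError(outcome.message)` depending on `outcome.ok`.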